Find answers from the community

I
Itamar
Offline, last seen 4 months ago
Joined September 25, 2024
Is it necessary to create Documents if my preprocessing pipeline outputs chunks? ie I have some unique data type that is input into my preprocessing pipeline, that pipeline outputs chunks of each data sample with associated metadata for each chunk. Can I just create Nodes and insert those nodes into my index? Better yet, can I create a Document and put nodes inside of it?
2 comments
L
W
I'm trying to insert a Document into an index but having some issues. My document contains metadata that I want to use after retrieval but that is not a string and should not be used to retreiver or query any LLMS. Here is what one document looks like:
Plain Text
Document(
                text=block["Text"],
                metadata=dict(
                    metadata,
                    **{
                        "page_num": str(block["Page"]),
                        "BoundingBox": block["BoundingBox"],
                    },
                    excluded_embed_metadata_keys=["BoundingBox"],
                    excluded_llm_metadata_keys=["BoundingBox"],
                ),
            ) 

This is the error message I get though:
Plain Text
{'error': [{'message': "invalid text property 'boundingBox' on class 'InvalidityGPTI': not a string, but map[string]interface {}"}]}


I cant seem to figure out where this is coming from

This is how I insert into my index:
Plain Text
self.index = VectorStoreIndex.from_vector_store(
                service_context=self.serviceContext,
                vector_store=WeaviateVectorStore(
                    weaviate_client=self.client, index_name="InvalidityGPTI"
                )
            )

self.index.insert(doc)
self.index.storage_context.persist()
3 comments
I
L
Can someone explain the different between setting the text_qa_template in a response synthesizer and the system_prompt in the LLM used in the response synthesizer?

The behavior I'm looking to achieve is to tell my LLM what they are and then set an example prompt and example answer. I am using the CitationQueryEngine because I also want to know what the citations for the query are.
12 comments
L
I