QQ on what is happening under the hood

At a glance

The community member is confused about what is happening under the hood for step 6 of the tutorial on GPT-Index. They expected that vector_store.add(nodes) would add the embedding, metadata, and the original text to the Pinecone vector database, but upon inspection, they found that only the embedding and metadata were stored, not the content text. The community member's question is where the content text is stored and how it would be joined with the embedding and metadata at the retrieval step.

Another community member suggests that the content text is probably stored in the "document_store", but there is no explicitly marked answer in the comments.

hhusjerry

QQ on what is happening under the hood for step 6 of this tutorial: https://gpt-index.readthedocs.io/en/latest/examples/low_level/ingestion.html#load-nodes-into-a-vector-store

So I thought vector_store.add(nodes) would add the embedding and metadata, along with the original text, to the Pinecone vector DB, but on inspecting the Pinecone console, it seems there is no content text stored, only the embedding and metadata.

So my natural question is: where is the content text stored? And how would it be joined back with the embedding and metadata at the retrieval step?

1 comment

bbmax

probably in the document_store
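
To make that suggestion concrete, here is a rough sketch of the general pattern it implies: the vector store keeps only embeddings and metadata, a separate document store keeps the text, and the two are joined by node ID at retrieval. This is a toy in plain Python, not LlamaIndex's or Pinecone's actual code; `ToyVectorStore`, `ToyDocStore`, and `retrieve` are hypothetical names, and the thread does not confirm that the tutorial's pipeline works this way.

```python
from dataclasses import dataclass, field
from math import sqrt
from typing import Dict, List


@dataclass
class Node:
    node_id: str
    text: str
    embedding: List[float]
    metadata: Dict[str, str] = field(default_factory=dict)


class ToyVectorStore:
    """Hypothetical store: keeps embeddings and metadata only, keyed by node ID."""

    def __init__(self) -> None:
        self.records: Dict[str, tuple] = {}

    def add(self, nodes: List[Node]) -> None:
        for node in nodes:
            # Note: node.text is deliberately NOT stored here.
            self.records[node.node_id] = (node.embedding, node.metadata)

    def query(self, query_embedding: List[float], top_k: int = 1) -> List[str]:
        def cosine(a: List[float], b: List[float]) -> float:
            dot = sum(x * y for x, y in zip(a, b))
            norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
            return dot / norm

        ranked = sorted(
            self.records.items(),
            key=lambda item: cosine(query_embedding, item[1][0]),
            reverse=True,
        )
        # Only node IDs come back; the text has to be looked up elsewhere.
        return [node_id for node_id, _ in ranked[:top_k]]


class ToyDocStore:
    """Hypothetical store: keeps the original text, keyed by the same node ID."""

    def __init__(self) -> None:
        self.texts: Dict[str, str] = {}

    def add(self, nodes: List[Node]) -> None:
        for node in nodes:
            self.texts[node.node_id] = node.text

    def get(self, node_id: str) -> str:
        return self.texts[node_id]


def retrieve(vector_store: ToyVectorStore, doc_store: ToyDocStore,
             query_embedding: List[float], top_k: int = 1) -> List[str]:
    # The "join" step: nearest-neighbour search yields IDs, the doc store yields text.
    return [doc_store.get(node_id) for node_id in vector_store.query(query_embedding, top_k)]


nodes = [
    Node("a", "Pinecone is a vector database.", [1.0, 0.0], {"source": "doc1"}),
    Node("b", "Bananas are yellow.", [0.0, 1.0], {"source": "doc2"}),
]
vector_store, doc_store = ToyVectorStore(), ToyDocStore()
vector_store.add(nodes)
doc_store.add(nodes)
results = retrieve(vector_store, doc_store, [0.9, 0.1])
```

In this sketch the node ID is the only link between the two stores, so the vector side can stay small while the text is fetched after the similarity search.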
