
Updated 3 months ago

Metadata

Hello all, I am using the Pinecone vector DB with LlamaIndex. I am trying to understand the metadata that is created when upserting to Pinecone (pasted below). In particular, I want to understand the difference between doc_id, document_id, and ref_doc_id.

document_id and ref_doc_id have the same value, so they seem kind of redundant.
I am also unsure what _node_content is for.

Any insight is greatly appreciated.

METADATA:
_node_content: "{"id": "a645c9ae-52a0-4702-b347-bf1c197fdd06", "embedding": null, "metadata": {"page_label": "16", "file_name": "LABORCODE.pdf"}, "excluded_embed_metadata_keys": [], "excluded_llm_metadata_keys": [], "relationships": {"1": {"node_id": "7bffb2e4-e08e-43a1-8027-5bd356e4c521", "node_type": null, "metadata": {"page_label": "16", "file_name": "LABORCODE.pdf"}, "hash": "431f995dbad545f4735413d1f065446f4cc2f9b5209ffd944d6a9f6ac03cec12"}}, "hash": "431f995dbad545f4735413d1f065446f4cc2f9b5209ffd944d6a9f6ac03cec12", "text": \ <DELETED FOR BREVITY> ", "start_char_idx": null, "end_char_idx": null, "text_template": "{metadata_str}\n\n{content}", "metadata_template": "{key}: {value}", "metadata_seperator": "\n"}"

doc_id: "7bffb2e4-e08e-43a1-8027-5bd356e4c521"

document_id: "7bffb2e4-e08e-43a1-8027-5bd356e4c521"

file_name: "LABORCODE.pdf"

page_label: "16"

ref_doc_id: "7bffb2e4-e08e-43a1-8027-5bd356e4c521"
2 comments
All 3 are the same thing lol
They've stuck around for legacy reasons. We moved all the node-to-JSON code into a single function.

Before that, every vector store was doing it slightly differently, hence all the names.
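
If you want to check this on your own records, here is a rough sketch using only the Python standard library. It assumes you have already fetched a record's metadata as a plain dict; the `metadata` dict below is a hand-built, shortened stand-in for the record pasted in the question (the node text is a placeholder), and `source_doc_ids` is a hypothetical helper, not part of LlamaIndex or Pinecone. It also assumes `_node_content` is a JSON string holding the serialized node, which is what the pasted value looks like.

```python
import json

# Hand-built stand-in for the metadata in the question (assumption: the real
# record arrives as a plain dict with _node_content stored as a JSON string).
metadata = {
    "_node_content": json.dumps({
        "id": "a645c9ae-52a0-4702-b347-bf1c197fdd06",
        "metadata": {"page_label": "16", "file_name": "LABORCODE.pdf"},
        "relationships": {
            "1": {"node_id": "7bffb2e4-e08e-43a1-8027-5bd356e4c521"}
        },
        "text": "<node text placeholder>",
    }),
    "doc_id": "7bffb2e4-e08e-43a1-8027-5bd356e4c521",
    "document_id": "7bffb2e4-e08e-43a1-8027-5bd356e4c521",
    "ref_doc_id": "7bffb2e4-e08e-43a1-8027-5bd356e4c521",
    "file_name": "LABORCODE.pdf",
    "page_label": "16",
}


def source_doc_ids(meta):
    """Hypothetical helper: distinct values among the three doc-id keys."""
    keys = ("doc_id", "document_id", "ref_doc_id")
    return {meta[k] for k in keys if k in meta}


# Decode the serialized node to see what is packed inside _node_content.
node = json.loads(metadata["_node_content"])

ids = source_doc_ids(metadata)
print(ids)  # one distinct value means the three keys are duplicates

# In the pasted record, the relationship under key "1" carries the same ID.
print(node["relationships"]["1"]["node_id"] in ids)
```

A single value in the printed set confirms the three keys are duplicates for that record, and the decoded `node` shows that `_node_content` also repeats `file_name` and `page_label` alongside the node text.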