Updated 3 months ago

I have a function that finds the real id

I have a function that finds the real id of a node from a custom id I added in the metadata when loading the documents:

def get_real_node_id_from_custom_id(custom_id, index):
    for node in index.docstore.docs.values():
        if node.metadata.get('custom_id') == custom_id:
            return node.id_
    return None

This works fine when I use the default llamaindex vector store, but not when I use chromadb. The only difference in my code is how I create the storage context:

if my_vector_store == "default":
    storage_context = StorageContext.from_defaults()
elif my_vector_store == "chromadb":
    chroma_client = chromadb.PersistentClient(path="./storage")
    chroma_collection = chroma_client.create_collection("quickstart")
    vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
    storage_context = StorageContext.from_defaults(vector_store=vector_store)

When using chromadb, it looks like index.docstore.docs is an empty {}.
How can I search my index for a specific metadata value when the vector store is chromadb?
3 comments
Yeah, third-party vector stores keep the nodes (text + embedding) in the vector store only, so the docstore is empty when you use one.

You could try fetching directly from chromaDB since you already have the chroma DB client chroma_client
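
A minimal sketch of that direct lookup, assuming the Chroma integration stores each node under its node id and copies the node's metadata (including custom_id) onto the Chroma record. That matches how ChromaVectorStore appears to behave, but check it against your installed version. It takes the chroma_collection from the question and uses Chroma's Collection.get() with a where filter on metadata:

```python
from typing import Any, Optional


def get_real_node_id_from_custom_id(custom_id: str, chroma_collection: Any) -> Optional[str]:
    """Return the id of the first Chroma record whose metadata matches custom_id.

    Assumption: the record id in Chroma is the LlamaIndex node id, and
    'custom_id' is stored as a top-level metadata field on the record.
    """
    # Collection.get() filters on metadata fields via `where`;
    # the matching record ids are always part of the result.
    result = chroma_collection.get(
        where={"custom_id": custom_id},
        include=["metadatas"],
    )
    ids = result["ids"]
    return ids[0] if ids else None
```

You would call it as get_real_node_id_from_custom_id("my-id", chroma_collection) instead of passing the index. If several nodes share a custom_id (for example, a document split into chunks), only the first match is returned.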
You can also override that behaviour and force the docstore to be used:

index = VectorStoreIndex(....., store_nodes_override=True)
This means you also need to save and load the entire storage context (using persist_dir="./storage" or similar).