
Updated last year

How do I inspect the reference docs that

At a glance

The community member is trying to inspect the reference documents they are inserting into a Chroma vector store. They have created a persistent Chroma vector store and are refreshing the reference documents, but they are unsure if the documents are being properly stored and saved to disk. The comments suggest that the refresh_ref_docs() function returns a list of booleans indicating whether the documents were inserted or updated. However, the community members are unsure how to further inspect the vector store. They mention finding a Chroma Reader, but are unclear on how it differs from using vector_store.as_query_engine(). The community members are advised to read the Chroma documentation, as it may provide more information on inspecting the vector store.

How do I inspect the reference docs that I'm attempting to insert into a chroma vector store? I'm creating a persistent chroma vector store and then loading it and refreshing the ref docs with the following, but I can't tell if the docs I'm passing to refresh are actually being stored and saved to disk properly.
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(
    docstore=SimpleDocumentStore.from_persist_dir(storage_params["persist_dir"]),
    vector_store=vector_store,
    index_store=SimpleIndexStore.from_persist_dir(storage_params["persist_dir"]),
)
service_context = ServiceContext.from_defaults(
    callback_manager=callback_manager,
    llm=llm,
    embed_model=OpenAIEmbedding(embed_batch_size=50),
    node_parser=node_parser,
)
vector_index = VectorStoreIndex(
    [],
    storage_context=storage_context,
    service_context=service_context,
    store_nodes_override=vector_index_params["store_nodes_override"],
)
results = vector_index.refresh_ref_docs(data)
Now that I've run refresh_ref_docs() how do I verify the refresh worked and persisted the docs/nodes/embeddings? thanks kindly!
4 comments
results = vector_index.refresh_ref_docs(data) will return a list of booleans, one per input document, where True means the document was inserted as a new document or updated.
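To make that list of booleans easier to read, here is a minimal sketch that pairs each flag with the ID of the document it belongs to. The helper name summarize_refresh is hypothetical (it is not part of LlamaIndex), and the usage comment assumes each document in data exposes a doc_id attribute.

```python
from typing import List, Sequence, Tuple


def summarize_refresh(
    doc_ids: Sequence[str], results: Sequence[bool]
) -> Tuple[List[str], List[str]]:
    """Split document IDs into (inserted_or_updated, unchanged).

    Hypothetical helper: `results` is expected to be the list returned by
    refresh_ref_docs(), in the same order as the documents passed in.
    """
    if len(doc_ids) != len(results):
        raise ValueError("doc_ids and results must have the same length")
    changed = [doc_id for doc_id, flag in zip(doc_ids, results) if flag]
    unchanged = [doc_id for doc_id, flag in zip(doc_ids, results) if not flag]
    return changed, unchanged


# Possible usage (assumes each document has a `doc_id` attribute):
# changed, unchanged = summarize_refresh([d.doc_id for d in data], results)
# print(f"{len(changed)} inserted/updated, {len(unchanged)} unchanged")
```

A False entry only means the document was treated as unchanged, so it is not by itself a sign that something failed.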

Other than that, I'm not sure if chroma has any utilities allowing you to inspect the vector store, might have to read some chroma docs for that 😅
ok thanks Logan. I did find a readme about a Chroma Reader, have you used that? I'm unclear how the reader differs from a vector_store.as_query_engine() https://gpt-index.readthedocs.io/en/stable/community/integrations/vector_stores.html#vector-store-index I'll see if I can figure this Reader out. thanks!
I haven't used the reader, but it looks a little clunky haha

I guess I meant reading the chroma docs instead. The collection has a get function which might be helpful https://docs.trychroma.com/reference/Collection#get
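For anyone landing here later, this is a rough sketch of what inspecting the collection with count() and get() could look like. The helper name describe_collection is hypothetical, and the "ref_doc_id" metadata key is an assumption about how the LlamaIndex Chroma integration tags nodes; check the metadata of one of your own entries and adjust ref_key if it differs.

```python
from collections import Counter
from typing import Any, Dict


def describe_collection(
    collection: Any, sample_size: int = 5, ref_key: str = "ref_doc_id"
) -> Dict[str, Any]:
    """Summarize what is stored in a Chroma collection.

    Hypothetical helper: `collection` is any object exposing Chroma's
    count() and get() methods. `ref_key` is an assumed metadata key.
    """
    total = collection.count()  # number of stored entries (nodes)
    sample = collection.get(
        limit=sample_size,
        include=["metadatas", "documents", "embeddings"],
    )
    embeddings = sample.get("embeddings")
    # Embeddings may come back as a list or an array, so avoid truthiness checks.
    has_embeddings = embeddings is not None and len(embeddings) > 0
    dim = len(embeddings[0]) if has_embeddings else None
    metadatas = sample.get("metadatas") or []
    refs = Counter((m or {}).get(ref_key, "<missing>") for m in metadatas)
    return {
        "total": total,
        "sample_ids": list(sample["ids"]),
        "embedding_dim": dim,
        "ref_doc_counts": dict(refs),
    }


# Possible usage (path and collection name are placeholders):
# import chromadb
# client = chromadb.PersistentClient(path="./chroma_db")
# collection = client.get_collection("my_collection")
# print(describe_collection(collection))
```

Running this in a fresh Python process after the refresh is a reasonable way to check that the data reached disk: if total is non-zero and embedding_dim is set, the nodes and embeddings were persisted.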
ok will do. cheers /0_o