
Updated 3 months ago

Anyone successfully persist and load a Chroma vector_store + docstore?

Has anyone successfully persisted and loaded a Chroma vector_store + docstore? I've tried everything I can think of but can never load anything. Please help! I'm totally stuck. I can see the "chroma.sqlite3" file and the docstore/index_store files after running vector_index.storage_context.persist(persist_dir=persist_dir), but I can never get any of it to load back. I've tried the following to no avail:

persist_dir2 = "C:/projects/technical-notes-llm-report/data/06_models/"
chroma_client2 = chromadb.PersistentClient(path=persist_dir2)
chroma_collection2 = chroma_client2.get_or_create_collection(collection_name)
vector_store2 = ChromaVectorStore(chroma_collection=chroma_collection2)

storage_context2a = StorageContext.from_defaults(
    docstore=SimpleDocumentStore.from_persist_path("C:/projects/technical-notes-llm-report/data/06_models/docstore.json"),
    vector_store=vector_store2,
    index_store=SimpleIndexStore.from_persist_path("C:/projects/technical-notes-llm-report/data/06_models/index_store.json"),
)

vector_index2 = VectorStoreIndex.from_vector_store(vector_store2, storage_context=storage_context2a, service_context=service_context, store_nodes_override=True)
vector_index3a = VectorStoreIndex([], storage_context=storage_context2a, store_nodes_override=True)

vector_index3a.ref_doc_info  # -> {}
vector_index3b.ref_doc_info  # -> {}

docstorea = storage_context2a.docstore
docstorea.get_all_ref_doc_info()  # -> {}

Thanks for any insight!
6 comments
vector_index2 should have the documents in it?
I figured this out; it's a little tricky:
from llama_index import VectorStoreIndex, Document, StorageContext
from llama_index.vector_stores import ChromaVectorStore
import chromadb

# Chroma keeps the embeddings on disk under ./chroma_db
client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_or_create_collection("new")
vector_store = ChromaVectorStore(chroma_collection=collection)

storage_context = StorageContext.from_defaults(vector_store=vector_store)

# store_nodes_override=True also writes the nodes to the docstore and index store
index = VectorStoreIndex.from_documents([Document.example()], storage_context=storage_context, store_nodes_override=True)

# Persist the docstore and index store next to the Chroma files
index.storage_context.docstore.persist(persist_path="./chroma_db/docstore.json")
index.storage_context.index_store.persist(persist_path="./chroma_db/index_store.json")

print(index.ref_doc_info)

from llama_index.storage.docstore import SimpleDocumentStore
from llama_index.storage.index_store import SimpleIndexStore

# Rebuild the storage context from the persisted files plus the same Chroma collection
new_storage_context = StorageContext.from_defaults(
    docstore=SimpleDocumentStore.from_persist_path("./chroma_db/docstore.json"),
    index_store=SimpleIndexStore.from_persist_path("./chroma_db/index_store.json"),
    vector_store=ChromaVectorStore(chroma_collection=collection),
)

from llama_index import load_index_from_storage

# Load the index from storage instead of constructing a new, empty one
new_index = load_index_from_storage(new_storage_context, store_nodes_override=True)

print(new_index.ref_doc_info)
This works; the loaded index has all the info.
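If loading still returns an empty ref_doc_info, it may help to check what was actually written to disk before blaming the load step. The following is only a rough sketch using the standard library: the helper names summarize_json_store and summarize_persist_dir are made up for illustration, and it assumes nothing about the layout of the persisted JSON beyond it being a JSON object. It just counts the entries under each top-level key. A docstore.json whose keys all show 0 entries would suggest the index was built without store_nodes_override=True, so there was nothing to load back.

```python
import json
from pathlib import Path


def summarize_json_store(path):
    """Return {top_level_key: entry_count} for a JSON file, or None if it is missing.

    Hypothetical helper: it makes no assumption about which keys the
    persisted store uses, it only reports how many entries each one holds.
    """
    p = Path(path)
    if not p.is_file():
        return None
    with p.open(encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, dict):
        return {}
    # Dicts and lists are counted by length; any other value counts as one entry.
    return {k: (len(v) if isinstance(v, (dict, list)) else 1) for k, v in data.items()}


def summarize_persist_dir(persist_dir, names=("docstore.json", "index_store.json")):
    """Summarize each expected store file inside persist_dir."""
    return {name: summarize_json_store(Path(persist_dir) / name) for name in names}


if __name__ == "__main__":
    # "./chroma_db" matches the path used in the snippet above; adjust as needed.
    for name, summary in summarize_persist_dir("./chroma_db").items():
        print(name, "->", "missing" if summary is None else summary)
```

Running this against the persist directory right after saving, and again before loading, should show whether the docstore was empty from the start or whether the files are simply not where the loader is looking.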
I'll give it a try right now. Thank you for digging into this, Logan!
OK, I'm back in business! I can finally save and load Chroma-backed vector stores. Thank you!