DocumentSummaryIndex

I am trying to use a DocumentSummaryIndex together with Chroma, and I fear I have a serious misunderstanding. All the examples I have seen that discuss this particular index do so using a VectorStoreIndex. However, I have not found a single example that stored this index in persistent storage such as a Chroma database. Of course, I could write custom tools, but I would rather not. Has anybody stored this type of index on a file system? I am interested in working examples. Since a summary index can be expensive to compute if there are many long documents, and it is meant to be reused many times, I surmise that such examples must exist. I am working on my laptop (i.e., not in the cloud). Thanks.
The DocumentSummaryIndex is composed of nodes and summaries. Given a set of documents, I chunk them into nodes. I then save these nodes into a Chroma database with the idea of reloading them at a later time to construct my DocumentSummaryIndex.
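To make the workflow concrete, the chunking step I mean looks roughly like this (a sketch; the reader, splitter, and chunk sizes are just placeholders for whatever I actually use):

```python
from llama_index.core import SimpleDirectoryReader
from llama_index.core.node_parser import SentenceSplitter

# load documents from a local folder (path is illustrative)
documents = SimpleDirectoryReader("./data").load_data()

# split each document into sentence-based chunks ("nodes")
splitter = SentenceSplitter(chunk_size=512, chunk_overlap=64)
nodes = splitter.get_nodes_from_documents(documents)
```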
Since I know that indexes can be persisted, I figured that the DocumentSummaryIndex could be stored in the Chroma database.
Is this correct, or am I mistaken? If the former, I would really appreciate a minimal working example that demonstrates saving the nodes and index to the database and reloading the data. I am working 100% with open-source models.

```python
from llama_index.core import DocumentSummaryIndex, StorageContext
from llama_index.core.storage.docstore import SimpleDocumentStore
from llama_index.core.storage.index_store import SimpleIndexStore
from llama_index.vector_stores.chroma import ChromaVectorStore

# could also use redis, mongodb, etc. for the docstore/index store
docstore = SimpleDocumentStore()
vector_store = ChromaVectorStore(...)
index_store = SimpleIndexStore()

# if left blank, these default to the "simple" versions; shown here for consistency
storage_context = StorageContext.from_defaults(
    vector_store=vector_store, docstore=docstore, index_store=index_store
)

index = DocumentSummaryIndex.from_documents(
    documents, ..., storage_context=storage_context
)

# chroma saves automatically;
# the simple document store saves to disk, other stores would save automatically as well
index.storage_context.persist(persist_dir="./storage")

# then load -- provide the vector store, since it is not saved to disk
from llama_index.core import load_index_from_storage

index = load_index_from_storage(
    StorageContext.from_defaults(persist_dir="./storage", vector_store=vector_store)
)
```
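The ChromaVectorStore(...) above is left elided; one way it might be constructed for purely local use is sketched below (the client path and collection name are assumptions, pick your own):

```python
import chromadb
from llama_index.vector_stores.chroma import ChromaVectorStore

# on-disk Chroma client, so everything stays on the laptop
# (path and collection name are illustrative)
chroma_client = chromadb.PersistentClient(path="./chroma_db")
chroma_collection = chroma_client.get_or_create_collection("doc_summaries")
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
```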
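Since the question asks for 100% open-source models, the LLM and embedding model might be configured globally along these lines; the specific models are illustrative choices, and any locally runnable pair works:

```python
from llama_index.core import Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.ollama import Ollama

# the LLM writes the per-document summaries; the embed model vectorizes the nodes
# (model names are illustrative; substitute whatever runs locally)
Settings.llm = Ollama(model="llama3", request_timeout=120.0)
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
```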
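Once reloaded, the index behaves like the freshly built one; a usage sketch (the query text is illustrative):

```python
# ask a question over the reloaded index
query_engine = index.as_query_engine()
response = query_engine.query("What are the main topics covered in the documents?")
print(response)

# or fetch the stored summary for a single document
# (assumes the original `documents` list is still in scope; otherwise use a known doc id)
print(index.get_document_summary(documents[0].doc_id))
```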