
Updated 3 months ago

Yea

I have a Milvus vector database, plus MongoDB databases for the docstore and index store. I created an index with VectorStoreIndex.from_documents using the following StorageContext:

Plain Text
storage_context = StorageContext.from_defaults(
    vector_store=vector_store,
    docstore=docstore,
    index_store=index_store,
)

index = VectorStoreIndex.from_documents(
    docs,
    storage_context=storage_context,
    service_context=service_context,
    store_nodes_override=True,
)


So at the moment I have a collection in Milvus, plus two databases with some collections in MongoDB.

Is there a specific way to recreate this index later using the docstore and index store from MongoDB and the specific collection from Milvus?

I've been trying for a while now to recreate the index exactly as I have it here, but nothing seems to give back the doc information in the docstore.

The closest I've been able to get is creating an index with the MongoDB storage context but no vector store, or with a vector store but without the storage context from MongoDB.

Edit: The main issue seems to be that the source_nodes information is always pulled from the vector database, even though most of that information is actually stored in the docstore, not the vector database.
Yeah, our vector store integrations assume everything will be stored in the vector store itself.

You can override this though!

VectorStoreIndex.from_documents(documents, store_nodes_override=True)
I did try passing a StorageContext and store_nodes_override=True when recreating the vector index later, but it still preferred to fetch doc data from the vector store instead of the MongoDB databases.

VectorStoreIndex.from_vector_store only appears to take a service context, not a storage context.

Should I just be able to create a VectorStoreIndex directly with this?
Plain Text
VectorStoreIndex(storage_context=load_storage_context, index_struct=index_struct, store_nodes_override=True)
Ah yeah, from_vector_store is pretty new. It seems like it was built assuming you were using only a vector store.

You should also be able to set up the entire storage context (with your existing vector store, docstore, and index store) and pass it in like so:

Plain Text
index = VectorStoreIndex([], storage_context=storage_context, service_context=service_context, store_nodes_override=True)
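For anyone landing here later, here is a rough end-to-end sketch of that suggestion. It assumes the legacy llama_index package layout from the service_context era, and that your docstore and index store were created with the default namespace. The helper names (connect_mongo_stores, rebuild_index, load_persisted_index) and their parameters are made up for illustration and are not part of the library. The last helper uses load_index_from_storage, which was not mentioned in this thread; the idea is that it reads the persisted index struct back from the index store, and forwarding store_nodes_override to it is an assumption to verify against your installed version.

```python
def connect_mongo_stores(mongo_uri, docstore_db, index_store_db):
    """Reconnect to the MongoDB-backed docstore and index store (hypothetical helper)."""
    # Imported lazily so the helpers can be defined without llama_index installed.
    from llama_index.storage.docstore import MongoDocumentStore
    from llama_index.storage.index_store import MongoIndexStore

    docstore = MongoDocumentStore.from_uri(uri=mongo_uri, db_name=docstore_db)
    index_store = MongoIndexStore.from_uri(uri=mongo_uri, db_name=index_store_db)
    return docstore, index_store


def rebuild_index(vector_store, docstore, index_store, service_context=None):
    """Wrap already-populated stores in a VectorStoreIndex, as suggested above."""
    from llama_index import StorageContext, VectorStoreIndex

    storage_context = StorageContext.from_defaults(
        vector_store=vector_store,  # your existing Milvus vector store object
        docstore=docstore,
        index_store=index_store,
    )
    # The empty node list means no new nodes are passed in for indexing.
    return VectorStoreIndex(
        [],
        storage_context=storage_context,
        service_context=service_context,
        store_nodes_override=True,
    )


def load_persisted_index(vector_store, docstore, index_store,
                         index_id=None, service_context=None):
    """Alternative: load the index struct that was saved in the index store."""
    from llama_index import StorageContext, load_index_from_storage

    storage_context = StorageContext.from_defaults(
        vector_store=vector_store,
        docstore=docstore,
        index_store=index_store,
    )
    # Assumption: extra keyword arguments are forwarded to the index constructor.
    return load_index_from_storage(
        storage_context,
        index_id=index_id,  # needed if the index store holds more than one index
        service_context=service_context,
        store_nodes_override=True,
    )
```

Whether source_nodes then come back from the docstore rather than from Milvus may still depend on the vector store integration and version, so it is worth checking a query result against what is in MongoDB.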
Ok thanks, I'll give that a try!