Updated 3 months ago

Hi, is there an example somewhere of

Hi, is there an example somewhere of using any external vector store (Pinecone, Milvus, etc.) with stores_text=False together with an external document store (Mongo, Redis, etc.)?
A lot of node data is not used by the vector store for search, and given the resource cost of most vector stores, this seems like it should be a standard pattern for scaling, but I can't find much documentation on it or any examples.
I don't have an example, but it should be fairly straightforward. The only reason it's not the default is to simplify storage (relying on just the vector store is a pretty attractive pattern).
vector_store = ...
vector_store.stores_text = False
Thanks Logan, but even when explicitly setting vector_store.stores_text to False, all of the fields still get stored in the vector store. The vast majority of that data is not necessary for the vector search.

So my question is really: how do I avoid storing all this data in the vector store? I'm talking about fields like "_node_content" and the metadata. I'm using sentence window retrieval with a small window size, so I have a ton of duplicated text in these fields. They take up over 95% of the in-memory storage in the vector store, which is why I want to pull them from a document store instead.

Do I have to write a custom vector-store module (or modify the existing one) for my vector store of choice (Milvus in my case)? Or is there an option that will disable saving everything except for the search fields and IDs?
Hmm, I think you'd have to write a custom subclass. Looking at it now, the .add() method will store it regardless; the .stores_text attribute is for external functions that use the vector store.
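To make the idea concrete, here is a rough sketch of the split in plain Python. It is not the LlamaIndex or Milvus API: Node, DocStore, SlimVectorStore, and retrieve are all hypothetical stand-ins, and the in-memory dicts stand in for a real vector database and a real document store. A real subclass would have to override the .add() of whichever vector store integration is in use so that it writes only the ID and embedding.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from math import sqrt


@dataclass
class Node:
    """Hypothetical stand-in for a text node (not the LlamaIndex class)."""
    node_id: str
    text: str
    embedding: list[float]
    metadata: dict = field(default_factory=dict)


class DocStore:
    """Hypothetical document store (think Mongo/Redis): holds the full payload."""

    def __init__(self) -> None:
        self._docs: dict[str, dict] = {}

    def put(self, node: Node) -> None:
        self._docs[node.node_id] = {"text": node.text, "metadata": node.metadata}

    def get(self, node_id: str) -> dict:
        return self._docs[node_id]


class SlimVectorStore:
    """Hypothetical vector store whose add() keeps only IDs and embeddings."""

    def __init__(self) -> None:
        self.rows: dict[str, list[float]] = {}

    def add(self, nodes: list[Node]) -> list[str]:
        # Deliberately drop text and metadata; only the search fields are stored.
        for node in nodes:
            self.rows[node.node_id] = node.embedding
        return [node.node_id for node in nodes]

    def query(self, embedding: list[float], top_k: int = 1) -> list[str]:
        # Brute-force cosine similarity, standing in for the real index.
        def cosine(a: list[float], b: list[float]) -> float:
            dot = sum(x * y for x, y in zip(a, b))
            norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
            return dot / norm if norm else 0.0

        ranked = sorted(
            self.rows, key=lambda i: cosine(embedding, self.rows[i]), reverse=True
        )
        return ranked[:top_k]


def retrieve(query_embedding, vector_store, doc_store, top_k=1):
    """Search by vector, then hydrate the hits from the document store by ID."""
    return [doc_store.get(i) for i in vector_store.query(query_embedding, top_k)]


# Usage: full content goes to the doc store, only vectors go to the vector store.
nodes = [
    Node("a", "cats purr", [1.0, 0.0], {"window": "cats purr loudly"}),
    Node("b", "dogs bark", [0.0, 1.0]),
]
doc_store = DocStore()
vector_store = SlimVectorStore()
for node in nodes:
    doc_store.put(node)
vector_store.add(nodes)
hits = retrieve([0.9, 0.1], vector_store, doc_store)
```

The trade-off is one extra lookup per retrieved node against the document store, in exchange for keeping the duplicated window text and metadata out of the vector store's memory.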
Got it.
Thanks Logan!