Updated 4 months ago

Managing document updates in production: seeking automation and optimal database solution

At a glance

The community member is facing challenges with managing document updates in production, as they have a client with frequent new documents. They are looking for the best way and vector database to handle updating old documents with new ones automatically. The comments suggest using a vector store like Qdrant, which can handle large amounts of documents. One community member recommends removing the previous collection and pointing the vector index to the new collection, running a cron job to create new collections in the background. Another community member asks for clarification on whether the "document updates" refer to updating existing documents or inserting new ones, and provides a suggestion to tag the nodes with a hash or filename and delete all nodes by that identifier when updating a document.

Also, I found out that managing document updates is really hard in production. I have a client who very often has new documents. For now the process is manual, but later I want to automate it. I'm still not sure what the best approach and the best vector database are for replacing old documents with new ones.
4 comments
Sell as in?

For new docs, you can always insert into the existing index.
For a large number of docs, I would suggest using a vector store (Qdrant works like a charm for me).

Also, if existing documents get updated and there are a lot of documents, I prefer removing the previous collection and pointing my vector index at the new collection. That avoids unnecessary conditional checking for each document.

I run a cron job that creates the new collection in the background; once it's created, I simply point the index at it and delete the previous one.
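A rough sketch of that rebuild-and-swap flow, in case it helps. This is not the poster's actual job: `VectorStore` is a hypothetical in-memory stand-in for the vector database, and `rebuild_and_swap` and the collection naming are made up for illustration. With Qdrant, a collection alias could play the role of the `active` pointer.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class VectorStore:
    """Hypothetical in-memory stand-in for a vector database."""
    collections: Dict[str, List[str]] = field(default_factory=dict)
    active: Optional[str] = None  # collection the index currently reads from


def rebuild_and_swap(store: VectorStore, docs: List[str], prefix: str = "docs") -> str:
    """Build a fresh collection, repoint the index, then drop the old one."""
    new_name = f"{prefix}_{uuid.uuid4().hex[:8]}"

    # 1. Build the new collection while readers keep using the old one.
    store.collections[new_name] = list(docs)

    # 2. Repoint the index only after the new collection is complete.
    old_name, store.active = store.active, new_name

    # 3. Delete the previous collection, if there was one.
    if old_name is not None:
        store.collections.pop(old_name, None)
    return new_name


if __name__ == "__main__":
    store = VectorStore()
    rebuild_and_swap(store, ["contract v1", "faq"])
    rebuild_and_swap(store, ["contract v2", "faq", "pricing"])
    print(store.active, store.collections[store.active])
```

A cron entry would simply call `rebuild_and_swap` on a schedule. The important part is the ordering: the pointer only moves after the new collection is fully built, so queries never hit a half-filled collection.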
By "document updates", do you mean updating existing docs that may have changed, or inserting new docs?
For updating existing docs, you can tag the nodes with a hash or filename; then, when a document is updated, you delete all nodes with that hash or filename and insert the new nodes.
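A minimal sketch of the tag-and-replace idea, under some assumptions: the node store is just a Python list, `Node`, `chunk`, and `upsert_document` are hypothetical names, and the fixed-size chunking is a placeholder. Here the filename is the tag used for deletion and the content hash is only used to detect whether anything changed.

```python
import hashlib
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    text: str
    filename: str  # tag used to find every node of a document
    doc_hash: str  # hash of the full document the node came from


def chunk(text: str, size: int = 200) -> List[str]:
    """Placeholder splitter: fixed-size character chunks."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def upsert_document(nodes: List[Node], filename: str, text: str) -> bool:
    """Replace all nodes of `filename` if its content changed.

    Returns True if the document was (re)indexed, False if it was unchanged.
    """
    new_hash = hashlib.sha256(text.encode("utf-8")).hexdigest()
    existing = [n for n in nodes if n.filename == filename]
    if existing and existing[0].doc_hash == new_hash:
        return False  # same content as last time, nothing to do

    # Delete every node tagged with this filename, then insert the new ones.
    nodes[:] = [n for n in nodes if n.filename != filename]
    nodes.extend(Node(c, filename, new_hash) for c in chunk(text))
    return True


if __name__ == "__main__":
    store: List[Node] = []
    print(upsert_document(store, "policy.txt", "old policy text"))  # indexed
    print(upsert_document(store, "policy.txt", "old policy text"))  # unchanged
    print(upsert_document(store, "policy.txt", "new policy text"))  # replaced
```

In a real vector store the delete step would be a filter on the stored metadata rather than a list comprehension, but the order of operations stays the same: compare hashes, delete by tag, insert the new nodes.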
Thanks, I will check out Qdrant now. By "sell" I mean: has anyone been paid for a RAG solution in production?