
Updated 2 months ago

Managing document updates in production: seeking automation and optimal database solution

Also, I found out that managing document updates is really hard in production. I have a client who gets new documents very often. So far it has been manual, but later I want to automate it. I'm still not sure what the best approach is, or which vector database is best for replacing old documents with new ones.
4 comments
Sell as in?

For new docs, you can always insert them into the existing index.
For a large number of docs, I would suggest using a vector store (Qdrant works like a charm for me).

Also, if there are updates to existing documents and there are a lot of documents, I prefer to point my vector index at a new collection and remove the previous one. That avoids unnecessary conditional checking.

I run a cron job that creates the new collection in the background. Once it's created, I simply point the index at it and delete the previous one.
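A rough sketch of that swap, in case it helps. This is not Qdrant code: `InMemoryStore`, `rebuild_and_swap`, and the `docs_vN` naming are all made up for illustration, and a real job would call your vector store's client in each of the three steps.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Optional


@dataclass
class InMemoryStore:
    """Hypothetical stand-in for a vector database client."""
    collections: dict = field(default_factory=dict)  # name -> list of records
    active: Optional[str] = None  # collection that queries are served from


_version = count(1)  # hypothetical naming scheme: docs_v1, docs_v2, ...


def rebuild_and_swap(store, documents):
    """Build a new collection, point the index at it, then delete the old one."""
    new_name = f"docs_v{next(_version)}"
    # 1. Build the new collection; readers keep using the old one meanwhile.
    store.collections[new_name] = [{"text": d} for d in documents]
    # 2. Switch the pointer only after the build has finished.
    old_name, store.active = store.active, new_name
    # 3. Delete the previous collection, if there was one.
    if old_name is not None:
        del store.collections[old_name]
    return new_name


store = InMemoryStore()
rebuild_and_swap(store, ["doc a", "doc b"])           # first cron run
rebuild_and_swap(store, ["doc a", "doc b2", "doc c"])  # later run, full rebuild
```

The point of doing it in this order is that queries never see a half-built collection: the old one stays live until the new one is complete.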
By "document updates", do you mean updating existing docs that may have changed, or inserting new docs?
For updating existing docs, you can tag the nodes with a hash or filename. Then, when a document is updated, you delete all nodes with that hash or filename and insert the new nodes.
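Something like this, as a minimal sketch. The index here is just a Python list of dicts standing in for a real vector store, and `upsert_document`, `split_into_nodes`, and the `doc_hash` field are hypothetical names. Skipping the work when the hash has not changed is my own assumption, not something you have to do.

```python
import hashlib


def content_hash(text):
    """Hash of the whole document, used to detect changes."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def split_into_nodes(text):
    # Placeholder chunking: one node per non-empty paragraph.
    return [p.strip() for p in text.split("\n\n") if p.strip()]


def upsert_document(index, filename, text):
    """Replace all nodes for `filename`. Returns True if the index changed."""
    new_hash = content_hash(text)
    existing = [n for n in index if n["filename"] == filename]
    # Assumption: if the stored hash matches, the document is unchanged.
    if existing and existing[0]["doc_hash"] == new_hash:
        return False
    # Delete every node tagged with this filename...
    index[:] = [n for n in index if n["filename"] != filename]
    # ...then insert the new nodes, tagged with filename and hash.
    index.extend(
        {"filename": filename, "doc_hash": new_hash, "text": chunk}
        for chunk in split_into_nodes(text)
    )
    return True


index = []
first = upsert_document(index, "faq.txt", "Old answer.\n\nShared paragraph.")
repeat = upsert_document(index, "faq.txt", "Old answer.\n\nShared paragraph.")
changed = upsert_document(index, "faq.txt", "New answer.")
```

With a real store, the delete step would be a filtered delete on the filename (or hash) metadata field, so the tag has to be written on every node at insert time.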
Thanks, I will check out Qdrant now. By "sell" I mean: has anyone been paid for a RAG solution in production?