Updated last year

Are there any viable approaches for this?

At a glance

The community member is asking for viable approaches to update an ElasticSearchVectorstore index with only the PDF files that have changed or been removed, without having to re-index all the files again, in order to save on embedding costs. Another community member provided an example for a similar issue when using a vector database integration, mentioning that the process is more complicated and requires persisting/maintaining extra files. The example is for ChromaDB, but the community member notes it should work for any vector store.

Are there any viable approaches for this issue? When loading all the documents for an index using a PDF loader, how can I update the index with only the PDF files that have changed or been removed, using the ElasticSearchVectorstore, without re-indexing all the files again and so save on embedding costs?
2 comments
I wrote an example for this the other day

When using a vector DB integration, the process is a little more complicated: you have to persist and maintain extra files.

https://discord.com/channels/1059199217496772688/1163880111074971790/1163900056718553169
(The example is for chromadb, but it should work for any vector store)
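
For readers who cannot open the linked example, here is a rough sketch of the general idea of keeping an extra file next to the index. It is not the code from the linked message: it assumes a hypothetical JSON manifest that maps each PDF path to a content hash, and every function name below is made up for illustration. It only works out which files were added, changed, or removed; the actual delete and insert calls depend on the vector store integration and are left out.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        # Read in 1 MiB chunks so large PDFs are not loaded into memory at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def scan(folder: Path) -> dict[str, str]:
    """Map each PDF's path (relative to folder) to its content hash."""
    return {
        p.relative_to(folder).as_posix(): file_digest(p)
        for p in sorted(folder.rglob("*.pdf"))
    }


def diff_manifest(old: dict[str, str], new: dict[str, str]):
    """Return (added, changed, removed) as sorted lists of relative paths."""
    added = sorted(k for k in new if k not in old)
    changed = sorted(k for k in new if k in old and old[k] != new[k])
    removed = sorted(k for k in old if k not in new)
    return added, changed, removed


def load_manifest(path: Path) -> dict[str, str]:
    """Load the manifest written by the previous run, or {} on the first run."""
    return json.loads(path.read_text()) if path.exists() else {}


def save_manifest(path: Path, manifest: dict[str, str]) -> None:
    """Persist the manifest so the next run can diff against it."""
    path.write_text(json.dumps(manifest, indent=2, sort_keys=True))
```

On each run you would call `load_manifest`, `scan` the PDF folder, and pass both to `diff_manifest`. Vectors for the removed and changed files get deleted from the store, only the added and changed files get embedded and inserted, and then `save_manifest` records the new state. Unchanged files are never re-embedded, which is where the cost saving comes from.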