
Updated 5 months ago

Update index

At a glance

The community member has a persisted database created from a folder of PDFs and wants to update the index to include a new file added to the folder, without recreating the entire index. They also want the new file to be persisted to disk, and the entire process to be automatic. A community member responds with a solution: they can update the existing index by loading the new data, adding it to the existing index, and persisting the updated index. This can be done automatically each time a new file is added.

Quick question: I have a persisted database created from a folder of PDFs. Now I've added another file to the folder, and I want to update the index to include it without recreating the entire index. I also want the new file to be persisted to disk, and if possible I want the entire process to be automatic. Is there a simple way to do this?
2 comments
Yes, you can update the existing index:

Plain Text
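from llama_index import GPTVectorStoreIndex, SimpleDirectoryReader

# Existing index (dev_docs and service_context are defined elsewhere in your code)
index_dev_docs = GPTVectorStoreIndex.from_documents(dev_docs, service_context=service_context)

# Load the new data
new_docs = SimpleDirectoryReader(NEW_DOCS_PATH).load_data()

# Insert each new document into the existing index
for doc in new_docs:
    index_dev_docs.insert(doc)

# Persist the updated index to disk
index_dev_docs.storage_context.persist()

# NOTE: re-create the query_engine instance so it uses the updated index
query_engine = index_dev_docs.as_query_engine()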


Add this to your code and run it whenever you add a new file; the file will be inserted into your index and persisted as well.
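For the "automatic" part of the question, one possible approach is to remember which files have already been indexed and only insert the ones that are new. The sketch below is not part of LlamaIndex: the JSON manifest file, `find_new_files`, `sync_new_files`, and the `insert_file` callback are all hypothetical names made up for illustration. It assumes files are only ever added to the folder (not edited or deleted) and that file names are unique.

```python
import json
from pathlib import Path
from typing import Callable, List


def find_new_files(folder: Path, manifest_path: Path) -> List[Path]:
    """Return files in `folder` whose names are not yet listed in the manifest."""
    seen = set()
    if manifest_path.exists():
        seen = set(json.loads(manifest_path.read_text()))
    return sorted(
        p
        for p in folder.iterdir()
        if p.is_file()
        and p.name not in seen
        # Skip the manifest itself in case it is stored inside the folder.
        and p.resolve() != manifest_path.resolve()
    )


def sync_new_files(
    folder: Path, manifest_path: Path, insert_file: Callable[[Path], None]
) -> List[str]:
    """Call `insert_file` for each unseen file, then record it in the manifest.

    Returns the names of the files that were inserted during this call.
    """
    new_files = find_new_files(folder, manifest_path)
    if not new_files:
        return []

    for path in new_files:
        insert_file(path)

    # Re-read the manifest and add the files that were just inserted.
    seen = set()
    if manifest_path.exists():
        seen = set(json.loads(manifest_path.read_text()))
    seen.update(p.name for p in new_files)
    manifest_path.write_text(json.dumps(sorted(seen)))
    return [p.name for p in new_files]


# Hypothetical wiring to the index from the answer above (names assumed):
#
# def insert_file(path):
#     for doc in SimpleDirectoryReader(input_files=[str(path)]).load_data():
#         index_dev_docs.insert(doc)
#     index_dev_docs.storage_context.persist()
#
# sync_new_files(Path(NEW_DOCS_PATH), Path("indexed_files.json"), insert_file)
```

Calling `sync_new_files` on a schedule (a cron job, or a simple loop with a sleep) would pick up new files without rebuilding anything. If the process restarts, you would probably want to reload the persisted index from disk rather than calling `from_documents` again, for example with `load_index_from_storage` and a `StorageContext` pointing at your persist directory.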
awesome, thanks!