Do I need to reload an index after every ingestion?

At a glance

The community members discuss the need to reload an index after ingesting new data. The main points are:

- For in-memory vector stores, the index does not need to be reloaded after each insertion. It should be persisted once the insertions are done, since anything inserted after the last save is not written to disk.

- For third-party vector databases like Chroma or Qdrant, loading the index only opens a connection to the database, so there is no need to reload the index after insertions.

- However, one member asks whether a multi-user system, where different users query different sets of documents concurrently, needs to load the index for each user request. The thread ends without a reply on this point.

The community members provide suggestions on the best approach for dynamic insertions and managing index loading in different scenarios.

Pujari12

Do I need to reload an index after every ingestion?

16 comments

WhiteFang_Jr

No, but you may need to create a new query_engine instance after insertion.

Pujari12

I am saving the index object at the start and then doing the insertion, creating query engine instances every time I am doing a query. Will this work?

WhiteFang_Jr

If you are inserting after saving the index, then the latest insertions might not get saved.

"creating query engine instances every time I am doing a query. Will this work?" -
Yeah, this part is fine.

Pujari12

Then what's the best way to do dynamic insertion? Or is it the case that we need to save the index after every insertion?

WhiteFang_Jr

Not after every insertion. Once all the insertions have taken place, you can do the persist part.

Persisting takes time to complete.
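The point about insertions made after a save can be illustrated with a minimal sketch. `TinyIndex` below is a hypothetical stand-in written only for this example (it is not the LlamaIndex API); it assumes an index whose `persist` writes the whole in-memory state to disk.

```python
import json
import tempfile
from pathlib import Path


class TinyIndex:
    """Hypothetical stand-in for an in-memory index; not the LlamaIndex API."""

    def __init__(self, docs=None):
        self.docs = list(docs or [])

    def insert(self, text):
        # Insertions only change the in-memory copy.
        self.docs.append(text)

    def persist(self, path):
        # Writes the whole index to disk, so the cost grows with index size.
        Path(path).write_text(json.dumps(self.docs))

    @classmethod
    def load(cls, path):
        return cls(json.loads(Path(path).read_text()))


path = Path(tempfile.mkdtemp()) / "index.json"

index = TinyIndex()
index.insert("doc A")
index.persist(path)    # the snapshot on disk contains only "doc A"
index.insert("doc B")  # inserted after saving: present in memory only

stale = TinyIndex.load(path)
print(stale.docs)      # ['doc A']

index.persist(path)    # persist again once the batch of insertions is done
fresh = TinyIndex.load(path)
print(fresh.docs)      # ['doc A', 'doc B']
```

The in-memory object always has the latest insertions; only a copy reloaded from disk can be stale.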

Pujari12

I am using a vector DB, and hence want to keep the index instance in memory for reduced latency when querying. And in my use case, the user can insert into an index over time, so I can't do the "save index after all insertions" :/

WhiteFang_Jr

Ah okay. So with a vector DB you don't have to reload! Once it is inserted, you are ready to use it on the go.

WhiteFang_Jr

lol I should have asked at the beginning whether it is for a local vector index or a third-party one 😅

Pujari12

haha

Harshvardhan

@WhiteFang_Jr
Suppose we have a third-party vector store such as Chroma or Qdrant. How can we add a new document to an existing collection in the store? Currently, I am building the index from the existing collection and then adding the new documents to the index. Is this the best approach, or is there a better way to accomplish this?

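# Import path assumed for llama-index 0.10+; older releases use llama_index.node_parser.
from llama_index.core.node_parser import SentenceSplitter

# `docs` (new documents) and `index` (built from the existing collection) come from earlier code.
splitter = SentenceSplitter(chunk_size=1024, chunk_overlap=128)
nodes = splitter(docs)
index.insert_nodes(nodes=nodes)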

WhiteFang_Jr

Yeah, this works for both in-memory vector store indexes and third-party vector DBs.

Pujari12

I think we don't need to build the index every time if we are using a vector DB; we can keep the index instance in memory and use it whenever required. @WhiteFang_Jr Please correct me if this is wrong.

WhiteFang_Jr

Yep, actually for a third-party DB, only a connection is made while loading the index, so there is no loss there.
If you are loading the index from local disk, it has to load all the embeddings and nodes, which increases the loading time and thus the response time of every query.

For third-party DBs, both can be done. I prefer keeping it in memory.

Harshvardhan

When you are building a product and you have different query requests for different documents, I believe we need to load from the local disk for every request, don't we? Or is there a better way to manage this?

WhiteFang_Jr

Not necessarily. If the system is one where you are adding documents to an existing index, you don't need to load the index every time.

You create the index when the server starts, and every new document can then be inserted into that index. There is no need to reload the index.
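A rough sketch of that pattern: one long-lived index created at startup, with a cheap query engine built per request. The classes and handler names here are hypothetical placeholders for illustration, not LlamaIndex classes.

```python
class TinyIndex:
    """Hypothetical stand-in for a long-lived index object."""

    def __init__(self):
        self.docs = []

    def insert(self, text):
        self.docs.append(text)


class TinyQueryEngine:
    """Cheap per-request wrapper; holds a reference to the index, not a copy."""

    def __init__(self, index):
        self.index = index

    def query(self, term):
        return [d for d in self.index.docs if term.lower() in d.lower()]


# Created once, when the server starts.
INDEX = TinyIndex()


def handle_insert(text):
    # New documents go straight into the existing index.
    INDEX.insert(text)


def handle_query(term):
    # A fresh query engine per request; the index itself is not reloaded.
    return TinyQueryEngine(INDEX).query(term)


handle_insert("Qdrant is a vector database")
before = handle_query("chroma")
handle_insert("Chroma is a vector database")
after = handle_query("chroma")
print(before)  # []
print(after)   # ['Chroma is a vector database']
```

Because the query engine only references the shared index, a document inserted between two requests is visible to the second one without any reload step.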

Harshvardhan

It's true, but if we have a multi-user system where multiple users can ask questions about their documents concurrently, we may need to load the index for each user. Right?
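The thread has no reply to this question. One possible approach, offered here purely as an assumption and not something the participants confirmed, is to cache one index per user so each is loaded at most once while it stays in the cache. `TinyIndex` and `get_index` are hypothetical names for this sketch.

```python
from functools import lru_cache

LOAD_CALLS = []  # records each simulated load, only to make the caching visible


class TinyIndex:
    """Hypothetical per-user index; stands in for whatever is loaded per user."""

    def __init__(self, user_id):
        self.user_id = user_id
        self.docs = []

    def insert(self, text):
        self.docs.append(text)

    def query(self, term):
        return [d for d in self.docs if term.lower() in d.lower()]


@lru_cache(maxsize=128)
def get_index(user_id):
    # Runs only on a cache miss. A real system would load the user's index
    # from disk, or connect to the user's collection, at this point.
    LOAD_CALLS.append(user_id)
    return TinyIndex(user_id)


get_index("alice").insert("Alice's notes on Qdrant")
get_index("bob").insert("Bob's notes on Chroma")

alice_hits = get_index("alice").query("notes")
bob_hits = get_index("bob").query("notes")
print(alice_hits)  # ["Alice's notes on Qdrant"]
print(bob_hits)    # ["Bob's notes on Chroma"]
print(LOAD_CALLS)  # ['alice', 'bob']
```

Each user is loaded once even though `get_index` is called twice per user. An entry evicted from the cache would be loaded again on that user's next request, and a real service would likely also need locking around concurrent inserts.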
