Hi Team,

Issue: Preprocessing a ~50 MB txt file is taking a very long time.

Explanation of the issue:

In our preprocessing flow (the add-to-knowledge-base flow), we tried uploading a 58 MB txt file. The file was broken into ~80k chunks, which needed to be uploaded into our Pinecone vector store via the llama_index wrappers.

We are seeing that the storage_context.docstore.add_documents() call is taking a very long time to execute.
After that, GPTVectorStoreIndex(nodes, storage_context, service_context) also takes a very long time.
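For reference, here is a minimal sketch of the flow described above. It assumes the legacy llama_index API (GPTVectorStoreIndex, ServiceContext) and an existing Pinecone index; credentials, the index name, chunk settings, and the file path are placeholders, not our real values.

```python
import pinecone
from llama_index import (
    GPTVectorStoreIndex,
    ServiceContext,
    SimpleDirectoryReader,
    StorageContext,
)
from llama_index.node_parser import SimpleNodeParser
from llama_index.vector_stores import PineconeVectorStore

# Placeholder credentials and index name.
pinecone.init(api_key="...", environment="...")
pinecone_index = pinecone.Index("knowledge-base")

# Load the large txt file (path is a placeholder).
documents = SimpleDirectoryReader(input_files=["big_file.txt"]).load_data()

# Same kind of splitter/chunk settings we have been using (values are placeholders).
parser = SimpleNodeParser.from_defaults(chunk_size=512, chunk_overlap=20)
nodes = parser.get_nodes_from_documents(documents)  # ends up as ~80k nodes

vector_store = PineconeVectorStore(pinecone_index=pinecone_index)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
service_context = ServiceContext.from_defaults()

# Step 1: this is the first call where we see the slowdown.
storage_context.docstore.add_documents(nodes)

# Step 2: building the index (embedding + upserting into Pinecone) is also very slow.
index = GPTVectorStoreIndex(
    nodes,
    storage_context=storage_context,
    service_context=service_context,
)
```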

My interpretation of it:

I think the number of chunks (80k+) is causing the slowness and making the document get "stuck" in the process. I'm not sure how to fix this, because we have been using the same chunk size and text splitter for months and they have performed really well.

Can someone help us with this? Any ideas on how to scale up in such cases?
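For example, would breaking the work into smaller batches be a reasonable direction? Something like the untested sketch below (continuing the sketch above); the insert_nodes batching here is only our guess, not something we have verified.

```python
# Build an empty index first, then insert nodes in batches so no single call
# has to handle all ~80k nodes at once (assumption, not a verified fix).
index = GPTVectorStoreIndex(
    [],
    storage_context=storage_context,
    service_context=service_context,
)

BATCH_SIZE = 1000  # placeholder batch size
for start in range(0, len(nodes), BATCH_SIZE):
    batch = nodes[start : start + BATCH_SIZE]
    index.insert_nodes(batch)  # embeds and upserts this batch into Pinecone
```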
4 comments
Also having this same issue; it started after the update.
Might also be a today-only issue because of the outage.