Find answers from the community
@Logan M Big docs are continuing to plague me with issues. When I create a DocumentSummaryIndex, this line grabs the first node's metadata, and that ends up exceeding Pinecone's limits. Shouldn't this also apply the exclude LLM/embed field lists? I did try to add that, but the embed-exclusion filter seems to happen somewhere else...
30 comments
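One practical workaround for the question above is to filter oversized metadata keys out of whatever is sent to the vector store and verify the serialized size against Pinecone's per-vector metadata cap before upserting. The sketch below is plain Python, not the LlamaIndex internals being discussed; the 40 KB limit value should be checked against current Pinecone documentation.

```python
import json

# Pinecone's per-vector metadata limit is ~40 KB (verify against current docs).
PINECONE_METADATA_LIMIT = 40_960


def filtered_metadata(metadata: dict, excluded_keys: set) -> dict:
    """Drop keys that should not be sent to the vector store,
    mimicking what an excluded-field list would do."""
    return {k: v for k, v in metadata.items() if k not in excluded_keys}


def fits_pinecone(metadata: dict) -> bool:
    """Check the JSON-serialized metadata size against the limit."""
    return len(json.dumps(metadata).encode("utf-8")) <= PINECONE_METADATA_LIMIT


# Example: a huge summary field blows the limit; excluding it fixes the upsert.
meta = {"summary": "x" * 50_000, "source": "doc-1"}
trimmed = filtered_metadata(meta, excluded_keys={"summary"})
```

In LlamaIndex itself, the per-node `excluded_llm_metadata_keys` and `excluded_embed_metadata_keys` lists serve a similar purpose, though, as the question notes, they are not consulted at this particular call site.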
Hi there, I have been working with ingestion pipelines and docstores, and I am finding that for a large document with a large number of nodes there can be a significant performance hit when doing any document management like delete/add. This is because a put happens on every node action, in delete for example (https://github.com/run-llama/llama_index/blob/a24292c79424affeeb47920b327c20eca5ba85ff/llama-index-core/llama_index/core/storage/docstore/keyval_docstore.py#L485), and depending on the number of remaining nodes, it can take a while. Would it make more sense to wait until all the nodes are removed before doing the put for ref_doc_info?
24 comments
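The performance issue described above can be illustrated with a toy key-value docstore. `ToyDocstore` is a hypothetical stand-in, not the real `KVDocumentStore`: the naive delete rewrites `ref_doc_info` once per removed node (the pattern at the linked line), while the batched variant removes all nodes first and writes `ref_doc_info` a single time.

```python
class ToyDocstore:
    """Minimal sketch of a key-value docstore that tracks
    ref_doc_id -> node ids, counting writes to ref_doc_info."""

    def __init__(self):
        self.kv = {}            # node_id -> node payload
        self.ref_doc_info = {}  # ref_doc_id -> list of node ids
        self.put_calls = 0      # number of ref_doc_info writes

    def _put_ref_doc_info(self, ref_doc_id, node_ids):
        self.put_calls += 1
        self.ref_doc_info[ref_doc_id] = node_ids

    def add(self, ref_doc_id, node_id, payload):
        self.kv[node_id] = payload
        ids = self.ref_doc_info.get(ref_doc_id, []) + [node_id]
        self._put_ref_doc_info(ref_doc_id, ids)

    def delete_naive(self, ref_doc_id):
        # One ref_doc_info put per removed node: O(n) writes.
        for nid in list(self.ref_doc_info.get(ref_doc_id, [])):
            self.kv.pop(nid, None)
            remaining = [n for n in self.ref_doc_info[ref_doc_id] if n != nid]
            self._put_ref_doc_info(ref_doc_id, remaining)
        self.ref_doc_info.pop(ref_doc_id, None)

    def delete_batched(self, ref_doc_id):
        # Remove every node first, then write ref_doc_info once: O(1) writes.
        for nid in self.ref_doc_info.get(ref_doc_id, []):
            self.kv.pop(nid, None)
        self.ref_doc_info.pop(ref_doc_id, None)
        self._put_ref_doc_info(ref_doc_id, [])
```

For a document with thousands of nodes, collapsing those per-node puts into one final write is the change the question is proposing.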