
Updated 9 months ago


Hi @Logan M, if I have some documents of 50-100 pages, split into chunks of 1024, and I want to send the full document to the LLM, is it OK to send all the chunks, or is it better to create a new index with SummaryIndex.from_documents() and use that in the query engine? Which option will give better results, or will they be the same? Or is there a better option? Thanks
9 comments
You can send them all to the LLM; the main thing is just using a response synthesizer that works for you.

If you are trying to summarize, I would use response_mode="tree_summarize"
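To make the tree_summarize suggestion concrete, here is a toy sketch of the idea behind that response mode: summarize chunks in small groups, then summarize the summaries, until one answer remains. The real LlamaIndex implementation calls an LLM and packs as much text per call as fits; the `summarize` stub below just joins text so the example stays runnable.

```python
def summarize(texts, max_len=200):
    """Stand-in for an LLM summarization call: joins texts and truncates."""
    return " ".join(texts)[:max_len]

def tree_summarize(chunks, fan_in=2):
    """Recursively reduce a list of chunks to a single summary."""
    while len(chunks) > 1:
        # Summarize each group of `fan_in` chunks, producing a smaller list.
        chunks = [
            summarize(chunks[i:i + fan_in])
            for i in range(0, len(chunks), fan_in)
        ]
    return chunks[0]

doc_chunks = [f"chunk {i} text" for i in range(5)]
print(tree_summarize(doc_chunks))
```

With a real LLM each `summarize` call is a prompt, so the number of calls grows roughly logarithmically with the number of chunks.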
If I don't have a context window limitation, e.g. using Claude with 100k context, is response_mode still required?
And if there is chunk overlap, could there be any performance degradation from sending all chunks versus using SummaryIndex and sending the whole document as one chunk?
There's always some context window limitation, so there will always be a response mode (whether that's tree_summarize or some other mode)

Hard to say on that second one. It depends on the LLM
I'm unsure whether sending all chunks means merging them into a larger chunk up to the size of the context window (in my case 100k for Claude 2.1), or whether they're simply sent as a big batch without merging?
using tree_summarize or compact response modes, it will already be stuffing the LLM input with as much retrieved text as possible (regardless of chunk size)
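The "stuffing" behavior described above can be sketched in plain Python: greedily concatenate retrieved chunks into as few prompts as fit within a context budget. The budget number and packing-by-character-count are simplifications for illustration; the real synthesizer counts tokens and accounts for the prompt template.

```python
def compact_pack(chunks, context_budget=100):
    """Greedily pack chunks into prompts no larger than the budget."""
    prompts, current = [], ""
    for chunk in chunks:
        candidate = (current + "\n" + chunk).strip()
        if len(candidate) > context_budget and current:
            # Current prompt is full; start a new one with this chunk.
            prompts.append(current)
            current = chunk
        else:
            current = candidate
    if current:
        prompts.append(current)
    return prompts

chunks = ["x" * 40, "y" * 40, "z" * 40]
print(len(compact_pack(chunks, context_budget=100)))  # → 2
```

So the original chunk size mostly determines retrieval granularity; at synthesis time the chunks are re-packed to fill the window.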
Super, thank you, I will do some tests to see any difference in the responses.
If I want to retrieve all nodes, is there any way to configure the query_engine to retrieve all nodes and not perform any similarity search?

vector_store = PGVectorStore.from_params()  # connection params elided
index = VectorStoreIndex.from_vector_store(vector_store)
query_engine = index.as_query_engine(
    similarity_top_k=all_nodes  # something like this, to skip similarity filtering?
)
not really, unless you set a huge top-k
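The "huge top-k" workaround works because a similarity retriever returns the k nearest nodes, so setting k at or above the total node count effectively disables filtering. A toy illustration, using naive word overlap as a stand-in for embedding similarity:

```python
def retrieve(nodes, query, top_k):
    """Return the top_k nodes scored by naive word overlap with the query."""
    def score(node):
        return len(set(node.split()) & set(query.split()))
    ranked = sorted(nodes, key=score, reverse=True)
    return ranked[:top_k]

nodes = ["alpha beta", "beta gamma", "delta"]
print(len(retrieve(nodes, "beta", top_k=2)))    # normal similarity search: 2 nodes
print(len(retrieve(nodes, "beta", top_k=999)))  # top_k >= len(nodes): all 3 returned
```

In LlamaIndex terms this corresponds to passing a `similarity_top_k` larger than the node count to `as_query_engine`; the similarity search still runs, it just stops discarding anything.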