Updated 3 months ago

What is the trade-off space between chunk size and LLM tokens?

I have been experimenting with optimizing this, and there seems to be a floor on query performance as a function of chunk size, and where that floor sits depends on document size. Increasing chunk size, however, increases the number of LLM tokens sent for each query response.

I am thinking of making chunk size a function of document size and optimizing search queries based on that, but would appreciate general thoughts to vet the concept.
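To make the idea concrete, here is a minimal sketch of chunk size as a function of document size. The function name `chunk_size_for`, the target of roughly 32 chunks per document, and the clamp bounds are all hypothetical values picked for illustration, not anything from the library or from measurements.

```python
import math


def chunk_size_for(doc_tokens: int,
                   min_chunk: int = 256,
                   max_chunk: int = 2048,
                   target_chunks: int = 32) -> int:
    """Pick a chunk size (in tokens) from the document length.

    Hypothetical heuristic: aim for about `target_chunks` chunks per
    document, then clamp the result to [min_chunk, max_chunk].
    """
    if doc_tokens <= 0:
        raise ValueError("doc_tokens must be positive")
    raw = math.ceil(doc_tokens / target_chunks)
    return max(min_chunk, min(max_chunk, raw))


if __name__ == "__main__":
    for size in (1_000, 32_000, 1_000_000):
        print(size, chunk_size_for(size))
```

The clamp is the part worth tuning: the lower bound stands in for the observed performance floor, and the upper bound caps how many tokens a single retrieved chunk can cost.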
14 comments
If you're using GPTSimpleVectorIndex and you reduce chunk size, make sure to increase similarity_top_k!
Because by default we only fetch one chunk (so if chunks are super small, they won't be informative).
Yeah, that's exactly what I am doing, to reasonable effect,
but if I cut chunks in half I have to more than double top_k, which is suboptimal
(based on my anecdotal tests).
This compounds on especially large documents, and answer quality suffers.
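The arithmetic behind that complaint can be sketched as follows, assuming the retrieved context sent to the LLM is roughly chunk size times top_k (prompt template and question tokens are ignored). The helper names and the example numbers are made up for illustration.

```python
def context_tokens(chunk_size: int, top_k: int) -> int:
    """Approximate tokens of retrieved context sent to the LLM per query."""
    return chunk_size * top_k


def top_k_for_budget(chunk_size: int, token_budget: int) -> int:
    """Largest top_k whose retrieved context fits in token_budget (at least 1)."""
    return max(1, token_budget // chunk_size)


if __name__ == "__main__":
    baseline = context_tokens(chunk_size=1024, top_k=1)
    # Halving the chunk size while more than doubling top_k (1 -> 3)
    # sends more context tokens than the baseline did.
    halved = context_tokens(chunk_size=512, top_k=3)
    print(baseline, halved)
    print(top_k_for_budget(chunk_size=512, token_budget=3000))
```

If halving the chunk size only required doubling top_k, the token cost would stay flat; needing more than double means smaller chunks end up costing more tokens for the same answer quality.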
If you want smaller text chunks but still want to inject document metadata, you can set the extra_info property on the Document object. This metadata will be injected into every text chunk.
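As a rough picture of what "injected into every text chunk" means, here is a plain-Python sketch. It does not use the library and is not how the library formats things internally; `inject_metadata` and the `key: value` header layout are assumptions for illustration only.

```python
def inject_metadata(chunks: list[str], extra_info: dict[str, str]) -> list[str]:
    """Prepend the same metadata header to every chunk.

    Illustrative only: the header layout here is an assumption, not the
    library's actual format.
    """
    header = "\n".join(f"{key}: {value}" for key, value in extra_info.items())
    return [f"{header}\n\n{chunk}" for chunk in chunks]


if __name__ == "__main__":
    chunks = ["First small chunk.", "Second small chunk."]
    for text in inject_metadata(chunks, {"title": "Q3 report", "author": "jy"}):
        print(text)
        print("---")
```

Note that the header is repeated in every chunk, so it counts against the token budget top_k times per query.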
Yeah, and maybe I could do a top_k retrieval and then search the document directly to fetch more context.
yep! also possible
What I am converging to is "I need a more advanced search algo that is less greedy"
You can also try defining a list index for each document, and then defining a simple vector index on top of the subindices through composability: https://gpt-index.readthedocs.io/en/latest/how_to/composability.html. Then when you retrieve a top-k "chunk", it'll route the query to the underlying list index, which will synthesize over the entire document.
Aren't list index queries O(N), though?
If so, I think I need to find a way to make this sub-linear, because I think directionally you're right, but I don't want to have to batch queries ahead of time (though that is an interesting concept for really good answers).
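The routing idea can be sketched in plain Python without the library. This is a toy: word-overlap (Jaccard) similarity stands in for embedding similarity, each document is represented by a hand-written summary, and `route` simply returns every chunk of the selected documents, which is the part a list index would then synthesize over. All names here are hypothetical.

```python
def similarity(a: str, b: str) -> float:
    """Jaccard overlap of lowercase word sets (stand-in for embeddings)."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def route(query: str, docs: dict[str, dict], top_k: int = 1) -> dict[str, list[str]]:
    """Top level: pick the top_k documents by summary similarity.

    Second level: hand back ALL chunks of each selected document, so the
    cost of the second step is linear in that document's chunk count.
    """
    ranked = sorted(docs,
                    key=lambda name: similarity(query, docs[name]["summary"]),
                    reverse=True)
    return {name: docs[name]["chunks"] for name in ranked[:top_k]}


if __name__ == "__main__":
    docs = {
        "billing": {"summary": "invoices payments and billing refunds",
                    "chunks": ["Refunds take 5 days.", "Invoices are monthly."]},
        "setup": {"summary": "installation and setup guide",
                  "chunks": ["Run the installer."]},
    }
    print(route("how do refunds work for billing", docs))
```

The top-level lookup stays cheap, but the second step touches every chunk of the chosen document.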

I could get the most asked questions and pre-render them.
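A minimal sketch of that pre-rendering idea, assuming a log of past queries is available and that `answer_fn` is whatever (expensive) function actually queries the index. Both names, and the simple lowercase/whitespace normalization, are placeholders for illustration.

```python
from collections import Counter
from typing import Callable


def normalize(query: str) -> str:
    """Lowercase and collapse whitespace so near-identical queries match."""
    return " ".join(query.lower().split())


def prerender(query_log: list[str],
              answer_fn: Callable[[str], str],
              top_n: int = 10) -> dict[str, str]:
    """Compute answers ahead of time for the top_n most frequent queries."""
    counts = Counter(normalize(q) for q in query_log)
    return {q: answer_fn(q) for q, _ in counts.most_common(top_n)}


def lookup(cache: dict[str, str], query: str,
           answer_fn: Callable[[str], str]) -> str:
    """Serve from the cache when possible, otherwise answer live."""
    key = normalize(query)
    if key in cache:
        return cache[key]
    return answer_fn(key)


if __name__ == "__main__":
    log = ["What is X?", "what is  x?", "How to Y"]
    cache = prerender(log, answer_fn=str.upper, top_n=1)
    print(cache)
    print(lookup(cache, "  What is X? ", answer_fn=str.upper))
```

Exact-match caching only helps with repeated phrasing; matching paraphrased questions would need a similarity lookup over the cached questions, which is not shown here.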