If I have lists in my documents that are longer than the node token count, does that mean parts of my list get separated from their original context? As in, no longer attributed to the header of the list or the description that comes before it? Would a good way to fix this be to change every entry in the list to a full sentence explaining the item's relation (x1 is in set y, x2 is in set y, etc.)?
Yeah, it would get split, although it would have to be a crazy long list 😅

That might be a good workaround, yeah (something like the sketch below).
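A minimal sketch of that workaround in plain Python (the header name and items are made up): each list entry is rewritten as a standalone sentence that repeats its parent header, so the relationship survives even if the list ends up split across nodes.

```python
def expand_list(header: str, items: list[str]) -> str:
    # Rewrite each entry as a full sentence that restates its parent header,
    # so every line carries its own context if the list is split across nodes.
    return "\n".join(f"{item} is in {header}." for item in items)

print(expand_list("set y", ["x1", "x2", "x3"]))
# x1 is in set y.
# x2 is in set y.
# x3 is in set y.
```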
How long would a list need to be? What is the maximum token count you can have in a node? I have lists that are sometimes 390+ lines with ~5 tokens each; do you think that is enough to be problematic?
Depends on the chunk size I suppose

By default the chunk size is 1024 tokens
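One way to check whether a particular list actually gets split is to run the node parser on it directly and count the resulting nodes. A rough sketch, assuming a recent llama-index release where the splitter lives in `llama_index.core.node_parser` (older releases import from `llama_index` directly); the 390-line list here is synthetic.

```python
from llama_index.core import Document
from llama_index.core.node_parser import SentenceSplitter

# Fake a 390-line list with roughly 5 tokens per entry, preceded by its header.
long_list = "Items in set y:\n" + "\n".join(
    f"- item {i} alpha beta gamma" for i in range(390)
)

# The default splitter uses a 1024-token chunk size with 200 tokens of overlap.
splitter = SentenceSplitter(chunk_size=1024, chunk_overlap=200)
nodes = splitter.get_nodes_from_documents([Document(text=long_list)])

# More than one node means the list was split, and the later chunks no longer
# contain the "Items in set y:" header.
print(len(nodes), "nodes")
```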
Can I increase the chunk size to whatever I want, or is there a hard limit? Thank you very much for taking the time to answer my questions.
You can increase it to whatever you want, and llama-index should handle it

But if it's too big to fit into the LLM, llama-index still has to chunk it again at query time
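For reference, a sketch of raising the chunk size, assuming the newer Settings API in `llama_index.core` (older releases configured this through a service context); the 4096 value is just an example.

```python
from llama_index.core import Settings
from llama_index.core.node_parser import SentenceSplitter

# Option 1: raise the default chunk size globally (4096 is an arbitrary example).
Settings.chunk_size = 4096

# Option 2: configure an explicit splitter and pass it wherever nodes are built,
# e.g. VectorStoreIndex.from_documents(docs, transformations=[splitter]).
splitter = SentenceSplitter(chunk_size=4096, chunk_overlap=200)
```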