what do you mean max input size is 40

what do you mean "max input size is 40% for most models"? i.e. context + prompt can't be more than 40% of the total tokens a model supports?
The max input size is 4096 tokens (not 40% heh)

Yea, this means the total amount of text sent to OpenAI has to be at most 4096 - num_output tokens long (which llama index handles for you)

So if you increase num_output (and max_tokens on the LLM), you are subtracting from how large the context from your documents can be
crap sorry - I just can't read
by default, num_output/max_tokens is 256
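Putting rough numbers on that (an illustration only - the exact prompt-template overhead varies, so the real context budget is a bit smaller than this):

```python
# Illustrative token budget; actual prompt overhead depends on the template
max_input_size = 4096  # the model's context window
num_output = 256       # tokens reserved for the response (the default)

# What's left for retrieved context plus the prompt scaffolding
context_budget = max_input_size - num_output
print(context_budget)  # 3840
```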
yup the rest of that makes sense
haha no worries, sounds good!
https://gpt-index.readthedocs.io/en/latest/guides/tutorials/terms_definitions_tutorial.html#extracting-and-storing-terms qq here though - why is max_input_size here 4096? given davinci's token limit, there'd only be one token left for output?
Yea, max_input_size is the max size possible for the model, not the max size that llama index sends to openai (a little confusing naming, I know)
llama index uses that to compute a bunch of stuff under the hood 🧠
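For reference, this is roughly how those values get wired together in the legacy gpt-index / LlamaIndex API that the linked tutorial uses - a sketch only, with `text-davinci-003` as an example model name, and exact class paths may differ in newer versions:

```python
from langchain.llms import OpenAI
from llama_index import LLMPredictor, PromptHelper

# max_tokens on the LLM and num_output on the prompt helper should agree
llm_predictor = LLMPredictor(
    llm=OpenAI(temperature=0, model_name="text-davinci-003", max_tokens=256)
)

prompt_helper = PromptHelper(
    max_input_size=4096,   # the model's maximum, not what gets sent per request
    num_output=256,        # tokens reserved for the response
    max_chunk_overlap=20,  # tokens shared between adjacent context chunks
)
```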
well while i have you here - what do chunk size limit (in service context) and max chunk overlap (prompt helper) do?
in the prompt helper, it makes it so every piece of context is at most chunk_size_limit tokens long

If it needs to split the retrieved context, it splits it into overlapping chunks, with the overlap set to the configured number of tokens

If you also set chunk_size_limit directly in the service context, it will also ensure the nodes created from your documents are at most that size (that splitting uses a 200-token overlap by default, which can be configured in the node parser)
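A sketch of those two settings in code, using the same era of the LlamaIndex API as above (class paths and the node parser setup are assumptions and may differ by version):

```python
from llama_index import ServiceContext
from llama_index.node_parser import SimpleNodeParser
from llama_index.langchain_helpers.text_splitter import TokenTextSplitter

# chunk_size_limit here caps both the context chunks built at query time
# and the size of the nodes parsed from your documents
service_context = ServiceContext.from_defaults(chunk_size_limit=1024)

# To change the default 200-token overlap used when parsing documents,
# configure the node parser's text splitter explicitly
node_parser = SimpleNodeParser(
    text_splitter=TokenTextSplitter(chunk_size=1024, chunk_overlap=100)
)
service_context = ServiceContext.from_defaults(node_parser=node_parser)
```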
Loooots of settings once you start digging into it lol
cool, that generally makes sense though - thanks.