Gauri
last year
Hi everyone! Can anyone tell me the max limit of max_new_tokens for the TheBloke/Llama-2-7b-Chat-GGUF model?
Logan M
last year
So, the thing about LLMs is that the input and output share the same context.
Llama 2 has a 4096-token context window, and every token you reserve with max_new_tokens is subtracted from the maximum input size.
So in theory the max is 4095, but that leaves room for only one input token.
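As a rough sketch of that budget arithmetic (plain Python; the constant and function names here are just for illustration, not from any library):
```python
CONTEXT_WINDOW = 4096  # Llama 2's total context size, shared by prompt + completion

def max_input_tokens(max_new_tokens: int) -> int:
    """Tokens left for the prompt once max_new_tokens are reserved for output."""
    remaining = CONTEXT_WINDOW - max_new_tokens
    if remaining < 1:
        raise ValueError("max_new_tokens leaves no room for the prompt")
    return remaining

print(max_input_tokens(256))   # 3840 tokens available for the prompt
print(max_input_tokens(4095))  # 1 -- only a single input token fits
```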
Gauri
last year
Okay, so that means if we set max_new_tokens = 4095, the model will take only 1 token per request as input?
Logan M
last year
Yes
Gauri
last year
okay
Gauri
last year
Can you please explain the difference between context window and max_new_tokens?
Logan M
last year
Context window is the max context size for the LLM
Max new tokens is how much of that context window should be reserved for output tokens
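For example, with LlamaIndex's LlamaCPP wrapper you set both values explicitly (a sketch; the import path varies by llama-index version, and the model file name is just an example):
```python
from llama_index.llms import LlamaCPP  # newer versions: llama_index.llms.llama_cpp

llm = LlamaCPP(
    model_path="llama-2-7b-chat.Q4_K_M.gguf",  # example file from TheBloke/Llama-2-7b-Chat-GGUF
    context_window=4096,   # total budget shared by prompt + completion
    max_new_tokens=256,    # portion of that budget reserved for the completion
)
# The prompt can then use at most 4096 - 256 = 3840 tokens.
```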
Gauri
last year
Okay! Got it
Gauri
last year
thank you
Gauri
last year
now my doubts are clear