Hi, I am attempting to leverage a HuggingFaceEndpoint LLM, but am running into issues with the token count. I tried to set the parameter chunk_size in the service context, but it does not appear to be reducing the number of tokens. Is there any way to reduce or manage the number of tokens used in a call? Any help would be appreciated.
Which LLM are you using? Did you set context_window in the LLM definition?
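For reference, a minimal sketch of what that looks like with LlamaIndex's legacy ServiceContext API, assuming a local HuggingFaceLLM wrapper (the model name and the window/chunk values are illustrative placeholders, not from this thread):

```python
from llama_index import ServiceContext, VectorStoreIndex, SimpleDirectoryReader
from llama_index.llms import HuggingFaceLLM

# Cap the prompt size in the LLM definition itself. Setting context_window
# below the model's true limit leaves headroom for imperfect token counting.
llm = HuggingFaceLLM(
    model_name="meta-llama/Llama-2-7b-chat-hf",    # illustrative model
    tokenizer_name="meta-llama/Llama-2-7b-chat-hf",
    context_window=3500,   # deliberately under the model's 4096-token limit
    max_new_tokens=256,    # also bounds tokens spent on the response
)

# chunk_size only controls how documents are split at index time;
# context_window is what actually limits the tokens packed into each call.
service_context = ServiceContext.from_defaults(llm=llm, chunk_size=512)

documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents, service_context=service_context)
response = index.as_query_engine().query("What does the document say?")
```

If you are hitting a remote endpoint rather than loading the model locally, the same context_window argument can be passed to the remote wrapper instead; the point is that chunk_size alone does not cap the per-call token count.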
I hear text generation from Hugging Face has been having bugs.
llama-2-7b-chat-hf-8170
And thank you for the quick response. Apologies for the delay.
@Logan M - it seems like modifying the context_window helped, but this was via trial and error. Any idea why?
Yea, token counting is not perfect, so lowering the context window allows for more margin of error when counting tokens 🙂