Hi, I am attempting to leverage a HuggingFaceEndpoint LLM, but am running into issues with the token count. I tried to set the parameter chunk_size in the service context, but it does not appear to be reducing the number of tokens. Is there any way to reduce or manage the number of tokens used in a call? Any help would be appreciated.
Which LLM are you using? Did you set context_window in the LLM definition?
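For reference, a minimal sketch of what that looks like with LlamaIndex's legacy ServiceContext API, assuming a local HuggingFaceLLM wrapper (the model name and the window/chunk values are illustrative placeholders, not from this thread):

```python
from llama_index import ServiceContext, VectorStoreIndex, SimpleDirectoryReader
from llama_index.llms import HuggingFaceLLM

# Cap the prompt size in the LLM definition itself. Setting context_window
# below the model's true limit leaves headroom for imperfect token counting.
llm = HuggingFaceLLM(
    model_name="meta-llama/Llama-2-7b-chat-hf",    # illustrative model
    tokenizer_name="meta-llama/Llama-2-7b-chat-hf",
    context_window=3500,   # deliberately under the model's 4096-token limit
    max_new_tokens=256,    # also bounds tokens spent on the response
)

# chunk_size only controls how documents are split at index time;
# context_window is what actually limits the tokens packed into each call.
service_context = ServiceContext.from_defaults(llm=llm, chunk_size=512)

documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents, service_context=service_context)
response = index.as_query_engine().query("What does the document say?")
```

If you are hitting a remote endpoint rather than loading the model locally, the same context_window argument can be passed to the remote wrapper instead; the point is that chunk_size alone does not cap the per-call token count.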
I hear text generation from Hugging Face has been having bugs.
llama-2-7b-chat-hf-8170
And thank you for the quick response. Apologies for the delay.
@Logan M - it seems like modifying the context_window helped, but this was via trial and error. Any idea why?
Yea, token counting is not perfect, so lowering the context window allows for more margin of error when counting tokens 🙂