

Hello, a question on using an agent and functions

At a glance

The community member posted a question about an exception they encountered when using an agent and functions. The exception indicated that the maximum context length of 8192 tokens was exceeded, with 3194 tokens in the messages, 1047 tokens in the functions, and 4000 tokens in the completion. The community members discussed how to calculate the context limit ahead of time and avoid such errors.

The comments suggest that it is difficult to calculate the token usage ahead of time, as the token count for the functions is not easily accessible. Some community members recommended leaving the max_tokens parameter unset, as this would allow the language model to determine the appropriate length of the output. However, others expressed concern that this could lead to the context window being exceeded. The community members discussed various strategies for limiting the context, such as cutting the chat history to leave room for the completion, but there was no explicitly marked answer.

Hello, a question on using an agent and functions. We just saw this exception
Plain Text
Error code: 400 - {'error': {'message': "This model's maximum context length is 8192 tokens. However, you requested 8241 tokens (3194 in the messages, 1047 in the functions, and 4000 in the completion). Please reduce the length of the messages, functions, or completion.", 'type': 'invalid_request_error', 'param': 'messages', 'code': 'context_length_exceeded'}}

So, what are the 1047 tokens in the functions, and how can we calculate ahead of time and restrict, say, the output, to avoid such an error? Thanks!
14 comments
Pretty hard to calculate this ahead of time tbh -- this means all the tool names and descriptions and associated schema took up 1047 tokens
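One rough way to estimate this ahead of time is to count tokens yourself with tiktoken, treating a JSON dump of the function schemas as a stand-in for whatever serialization OpenAI applies internally. The numbers are approximate, and the helper below is just a sketch.
Plain Text
import json
import tiktoken

def estimate_tokens(messages, functions, model="gpt-4"):
    # Approximate pre-flight count: message contents plus ~4 tokens of
    # per-message overhead, and a JSON dump of the function schemas.
    enc = tiktoken.encoding_for_model(model)
    msg_tokens = sum(len(enc.encode(m.get("content") or "")) + 4 for m in messages)
    fn_tokens = len(enc.encode(json.dumps(functions)))
    return msg_tokens, fn_tokens

msg_tokens, fn_tokens = estimate_tokens(
    [{"role": "user", "content": "Hello, a question on using an agent and functions."}],
    [{"name": "search", "description": "Search the index", "parameters": {"type": "object", "properties": {}}}],
)
print(msg_tokens, fn_tokens)  # compare the sum (plus a completion budget) against the 8,192-token window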

Did you explicitly set max_tokens? I would leave that unset (seems like you had it set to 4000)
yes, I pass it to OpenAI like:
Plain Text
OpenAI(temperature=0, model=model_name, api_key=ai_key, max_tokens=4000)
Is it okay to leave it unset? The potential problem is that if I don't pass this limit, the output may be much longer and the whole context window will be blown up. And again, if I don't know the output limit, how can I restrict the context limit? An equation with multiple variables is unsolvable 😦
If it's unset, it's unlimited (until the LLM decides to stop writing or it hits the maximum output length). Much safer to leave it unset imo
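For instance, the call quoted above with max_tokens simply omitted:
Plain Text
# max_tokens left unset -- the model stops on its own or at its maximum output length
OpenAI(temperature=0, model=model_name, api_key=ai_key)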
Yes, but I need to calculate the context limit somehow. If I don't know the output limit, how can I limit the context?
You can leave it unset on the llm, but still limit the context
That avoids errors like your original, and still leaves plenty of room for the llm
Say my model has a limit of 8,192 and the conversation is pretty long, so I have to cut it, right? Even if I don't pass 4,000 or whatever to the LLM, how can I cut the chat history if I don't know what limit I have for it?
You can cut it leaving room for 4000 then
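A minimal sketch of that idea with tiktoken, assuming plain chat-format messages; the constants and helper name are made up for illustration.
Plain Text
import json
import tiktoken

CONTEXT_WINDOW = 8192
COMPLETION_BUDGET = 4000  # room reserved for the model's reply

def trim_history(messages, functions, model="gpt-4"):
    # Drop the oldest messages until messages + functions fit in what is left
    # after reserving the completion budget.
    enc = tiktoken.encoding_for_model(model)
    fn_tokens = len(enc.encode(json.dumps(functions)))
    budget = CONTEXT_WINDOW - COMPLETION_BUDGET - fn_tokens
    trimmed = list(messages)
    while trimmed and sum(len(enc.encode(m.get("content") or "")) + 4 for m in trimmed) > budget:
        trimmed.pop(0)
    return trimmed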
Yeah, I thought the same... just wondered, maybe you had a better idea. Thanks!
I have one more question, if you don't mind: How can I know how many tokens were spent on messages, functions, and the completion, other than seeing it in the exception description? Where can I find this breakdown? Thanks!
I don't actually think openai reports this anywhere
They just report the total input and output usage
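For reference, a sketch against the raw openai Python SDK: the v1-style client only exposes prompt and completion totals on the response, not a separate messages/functions split.
Plain Text
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello"}],
)
# Only the totals are reported: prompt (input) and completion (output) tokens.
print(response.usage.prompt_tokens, response.usage.completion_tokens, response.usage.total_tokens)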
Okay, fair enough, thanks!