Hey @Logan M how are you? I'm trying to use the workflow example to generate sub-questions, then go through and use a ReAct agent to answer each sub-question, from here:
https://docs.llamaindex.ai/en/stable/examples/workflow/sub_question_query_engine/
The issue is when I get to this point in the sub-question routine:
agent = ReActAgent.from_tools(
await ctx.get("tools"), llm=llm_4o_2, verbose=False, max_iterations=5
)
response = agent.chat(ev.question)
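(For context, that call sits inside the sub-question step from the docs example; mine looks roughly like this — llm_4o_2 being a gpt-4o instance and the event names following the docs, so the details may differ slightly from my real code:)

from llama_index.core.agent import ReActAgent
from llama_index.core.workflow import Context, Event, Workflow, step
from llama_index.llms.openai import OpenAI

llm_4o_2 = OpenAI(model="gpt-4o")  # my LLM instance

class QueryEvent(Event):
    question: str

class AnswerEvent(Event):
    question: str
    answer: str

class SubQuestionQueryEngine(Workflow):
    @step
    async def sub_question(self, ctx: Context, ev: QueryEvent) -> AnswerEvent:
        # one ReAct agent per generated sub-question, using the tools
        # stashed on the workflow context earlier
        agent = ReActAgent.from_tools(
            await ctx.get("tools"), llm=llm_4o_2, verbose=False, max_iterations=5
        )
        response = agent.chat(ev.question)
        return AnswerEvent(question=ev.question, answer=str(response))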
There are some sub-question queries where it fails with:
Error code: 400 - {'error': {'message': "This model's maximum context length is 128000 tokens. However, your messages resulted in 129643 tokens. Please reduce the length of the messages.", 'type': 'invalid_request_error', 'param': 'messages', 'code': 'context_length_exceeded'}}
I really don't understand how to control this. BTW, the tools are a list of retriever tools, each of which was supposed to have a node_postprocessor reranker to titrate down the nodes, but I keep hitting this error regardless.
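For reference, each tool is built roughly like this (a sketch only: "index", the tool name, and the top_k/top_n values are placeholders, and I'm showing SentenceTransformerRerank just as an example of the reranker):

from llama_index.core.postprocessor import SentenceTransformerRerank
from llama_index.core.tools import RetrieverTool

# reranker meant to titrate the retrieved nodes down to a handful
reranker = SentenceTransformerRerank(top_n=3)

tool = RetrieverTool.from_defaults(
    retriever=index.as_retriever(similarity_top_k=10),  # index = one of my vector indexes
    node_postprocessors=[reranker],  # applied before the agent sees the nodes
    name="docs_retriever",
    description="Retrieves relevant chunks from my docs",
)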