I am running a multi-document agent, similar to
Multi Documents Agents. The differences: I use a local Redis vector store, I added chat memory, and I am now using ReActAgents with a local LLM served through llama.cpp. My problem is that the top-level agent always wants to use the provided tools and no longer answers conversationally on its own. For example, when I only send "Hey my name is Paul", it wants to use all the tools.
The query then fails with errors like:
Observation: Error: Could not parse output. Please follow the thought-action-input format. Try again.
The Observation before this error also seems too long, because it breaks off in the middle of a sentence. Might this error be related to the context window? These Observation errors occur again and again until the maximum iteration limit is reached, and I get no result.
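To illustrate what I suspect is happening, here is a minimal sketch of how a response that is cut off mid-way would fail a thought-action-input parse. This is not the library's actual parser: the regex, the `parse_step` function, and the `doc_tool` tool name are hypothetical and only meant to show the failure mode.

```python
import re

# Hypothetical pattern for the thought-action-input format.
# The real parser used by the agent library may differ.
REACT_STEP = re.compile(
    r"Thought:\s*(?P<thought>.+?)\s*"
    r"Action:\s*(?P<action>\S+)\s*"
    r"Action Input:\s*(?P<action_input>\{.*\})",
    re.DOTALL,
)


def parse_step(text: str):
    """Return (thought, action, action_input), or None if the text does not match."""
    match = REACT_STEP.search(text)
    if match is None:
        return None
    return (
        match.group("thought"),
        match.group("action"),
        match.group("action_input"),
    )


# A complete step; "doc_tool" is a made-up tool name.
complete = (
    "Thought: I need to look this up.\n"
    "Action: doc_tool\n"
    'Action Input: {"input": "What is in the document?"}'
)

# The same step cut off half-way, as if the generation limit had been reached.
truncated = complete[: len(complete) // 2]

print(parse_step(complete))   # a tuple of three strings
print(parse_step(truncated))  # None, because the "Action Input" part is missing
```

If truncation is the cause, I would expect every retry to be cut off in the same way, which would match the repeated parse errors.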
How would you solve this problem?