I want to have an OpenAI agent that functions as a chat interface.
I want to stream the output of my CitationQueryEngine when that tool is being used, and otherwise stream the output of the agent itself.
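For context, here is roughly how I am wiring things up. This is only a sketch: the data path, tool name, and description are placeholders on my side, and I am assuming `CitationQueryEngine.from_args(..., streaming=True)` plus `QueryEngineTool` and `OpenAIAgent` as the building blocks.

```python
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.query_engine import CitationQueryEngine
from llama_index.core.tools import QueryEngineTool
from llama_index.agent.openai import OpenAIAgent

# Build the index (placeholder data path).
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)

# Citation query engine with streaming enabled so tokens can be forwarded as they arrive.
citation_query_engine = CitationQueryEngine.from_args(
    index,
    similarity_top_k=3,
    streaming=True,
)

# Wrap it as a tool for the agent (the name/description are my own choices).
citation_tool = QueryEngineTool.from_defaults(
    query_engine=citation_query_engine,
    name="citation_query_engine",
    description="Answers questions over my documents and returns cited sources.",
)

agent = OpenAIAgent.from_tools([citation_tool], verbose=True)
```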
So I want early stopping of the agent when the query engine tool is selected; in that case I will stream the query engine's events manually using a queue. I was going to do this by stepping through the agent, using it only for query planning, and then running the stream on the tasks manually, roughly as in the sketch below.
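Here is the kind of step-through / queue pattern I have in mind. Again just a sketch under my assumptions: `used_citation_tool` is a hypothetical helper of mine (shown further down), the `None` sentinel is my own convention, and I am relying on `create_task` / `run_step` / `finalize_response` from the agent runner.

```python
import queue
import threading

token_queue: queue.Queue = queue.Queue()

def stream_citation_answer(question: str) -> None:
    """Run the citation query engine directly and push its tokens onto the queue."""
    streaming_response = citation_query_engine.query(question)  # StreamingResponse, since streaming=True
    for token in streaming_response.response_gen:
        token_queue.put(token)
    token_queue.put(None)  # sentinel: stream finished

def chat(question: str) -> None:
    # Let the agent do the planning: create a task and run a single step.
    task = agent.create_task(question)
    step_output = agent.run_step(task.task_id)

    if used_citation_tool(task):  # my own helper, sketched further down
        # Early stop: stream the citation query engine myself via the queue.
        threading.Thread(target=stream_citation_answer, args=(question,), daemon=True).start()
        while (token := token_queue.get()) is not None:
            print(token, end="", flush=True)
    else:
        # No query engine involved: let the agent run to completion and use its response.
        while not step_output.is_last:
            step_output = agent.run_step(task.task_id)
        print(agent.finalize_response(task.task_id))
```

As far as I can tell, by the time `run_step` returns the tool call has already executed once, so this sketch ends up running the query engine twice just to get a token stream; avoiding that double pass is part of what I am trying to work out.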
The agent otherwise ruins the nice work of the citation query engine, and additional LLM passes without the RAG context drop accuracy. Stopping at the citation query engine's stream is the best approach.
I like how you have split up task planning and execution with the step-through approach. I am just struggling to see how I can identify which tool (or tool id) was selected for a given task.
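The closest I have come is checking the ToolOutput objects that a step leaves behind, since each ToolOutput carries a tool_name. Something like the hypothetical helper below, assuming the OpenAI agent worker stashes them in `task.extra_state["sources"]` (that storage location is exactly the part I am unsure about):

```python
def used_citation_tool(task) -> bool:
    """Check whether the citation query engine tool has been called for this task."""
    # Each executed tool call should leave a ToolOutput behind; ToolOutput.tool_name
    # is the name the tool was registered with (here: "citation_query_engine").
    sources = task.extra_state.get("sources", [])
    return any(source.tool_name == "citation_query_engine" for source in sources)
```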