It doesn't really show the execution time, only some extra logs, but I'm doing the timing myself.
1.93 sec for the initial call:
agent_chat_response = self._get_agent_response(mode=mode, **llm_chat_kwargs)
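That number comes from wrapping the call roughly like this (just a sketch of my measurement; `self` here is the agent instance inside my local copy of that method, the timing lines are mine):

import time

t0 = time.perf_counter()
# the call below is the llama-index line I quoted above; the timing around it is my own
agent_chat_response = self._get_agent_response(mode=mode, **llm_chat_kwargs)
print(f"_get_agent_response took {time.perf_counter() - t0:.2f} sec")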
Afterwards we have
=== Calling Function ===
Calling function: query_engine_tool with args: {
"input": "some question..."
}
Got output: ....
which took 7.155 secs
And finally 6.54 sec for the agent_response with
current_func='auto'
which I assume is the GPT response itself, am I correct?
I can't speed that up, but the other half of the time is the index query, which can be sped up (see the sketch below).
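For example, this is the kind of tweak I have in mind on the query side (only a sketch; `index` stands for my existing VectorStoreIndex, and the tool name/description are placeholders):

from llama_index.core.tools import QueryEngineTool

# retrieve fewer chunks so the query engine has less context to synthesize
query_engine = index.as_query_engine(similarity_top_k=2)

query_engine_tool = QueryEngineTool.from_defaults(
    query_engine=query_engine,
    name="query_engine_tool",
    description="Answers questions over my documents",
)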
I also got the feeling that things got worse after I updated to the latest version of llama-index.