A context chat engine works by doing retrieval on every user message: it runs `index.retrieve()` and puts the retrieved text into the system prompt. Then, along with the chat history, the LLM responds.
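That flow can be sketched in plain Python. This is a minimal illustration, not the real API: `retrieve` and `llm` below are stubs standing in for `index.retrieve()` and the actual model call.

```python
def retrieve(query):
    # Stub for index.retrieve(): return text chunks relevant to the query.
    corpus = {
        "llama": "Llamas are camelids native to South America.",
        "index": "An index stores embeddings for fast retrieval.",
    }
    return [text for key, text in corpus.items() if key in query.lower()]

def llm(system_prompt, history):
    # Stub LLM: echoes its inputs so the data flow is visible.
    return f"(answer grounded in: {system_prompt!r}, {len(history)} prior turns)"

def context_chat(message, history):
    # 1. On every user message, run retrieval.
    chunks = retrieve(message)
    # 2. Put the retrieved text into the system prompt.
    system_prompt = "Context:\n" + "\n".join(chunks)
    # 3. The LLM responds, given the system prompt plus chat history.
    history = history + [("user", message)]
    reply = llm(system_prompt, history)
    return reply, history + [("assistant", reply)]

reply, history = context_chat("Tell me about the llama", [])
```

Note that retrieval happens on every turn, whether or not the message actually needs fresh context; that is the main trade-off versus an agent, which decides per turn.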
An agent will look at the chat history + list of tools, decide if it needs to invoke a tool, and then interpret that tool's response to decide whether it needs to run another tool or return an answer to the user.
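The agent loop can be sketched the same way. Again a pure-Python sketch with a stubbed decision function (`llm_decide`) in place of the real LLM; the point is the loop structure: decide, call tool, feed the result back, repeat until the LLM returns a final answer.

```python
def search_tool(query):
    # Hypothetical tool: in practice this could be web search, a query
    # engine, a calculator, etc.
    return "search results for: " + query

def llm_decide(history, tools):
    # Stub for the LLM's decision step: pick a tool call or a final answer.
    last = history[-1][1]
    already_used_tool = any(role == "tool" for role, _ in history)
    if "weather" in last and not already_used_tool:
        return ("tool", "search", "weather today")
    return ("final", "Here is what I found: " + last)

def run_agent(message, tools):
    history = [("user", message)]
    while True:
        decision = llm_decide(history, tools)
        if decision[0] == "final":
            # The LLM is done calling tools; return to the user.
            history.append(("assistant", decision[1]))
            return history
        _, tool_name, tool_input = decision
        # Invoke the chosen tool and feed its output back into the history,
        # so the next decision step can interpret it.
        history.append(("tool", tools[tool_name](tool_input)))

history = run_agent("what's the weather today?", {"search": search_tool})
```

Here the loop runs twice: once to call the tool, once to interpret its output and answer.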
Oh I see. So context chat: retrieval => context + chat history => LLM. Agent: chat history + tools => select a tool (e.g. a sub-question query engine wrapped as a tool) => interpret the result => return.
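To make the parenthetical concrete: from the agent's point of view, a query engine is just another callable tool. A hypothetical sketch, where `sub_question_query_engine` is a stand-in for a real sub-question query engine that splits a question into sub-questions, answers each, and combines the results:

```python
def sub_question_query_engine(question):
    # Stand-in for a sub-question query engine: break the question into
    # sub-questions, answer each (stubbed), then combine the answers.
    subs = [part.strip() for part in question.split("and")]
    answers = [f"answer({s})" for s in subs]
    return "; ".join(answers)

# To the agent, the whole query engine is exposed as one tool in its tool list.
tools = {"sub_question": sub_question_query_engine}
result = tools["sub_question"]("compare revenue and compare growth")
```

So the two designs compose: the agent's tool-selection loop sits on top, and retrieval pipelines like this one sit underneath as tools.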