Query Engine's Handling of Maximum Context Window Limits
At a glance
Community members discuss how a query_engine handles exceeding the maximum context window when using GPT-4 with its 8,192-token context window. They explain that the engine makes multiple LLM calls, with a response synthesizer refining the answer across those calls. The discussion also covers using a ChatEngine, which requires a vector index, and the possibility of using a simpler agent or SimpleChatEngine if the context is passed in directly. However, it's noted that SimpleChatEngine does not specially handle exceeding the context window; instead it relies on its memory to filter the history and keep the conversation within limits.
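The multi-call refinement described above can be sketched roughly as follows. This is an illustrative mock, not LlamaIndex's actual implementation: `fake_llm` stands in for a real model call, and the prompt wording is invented. The point is that when the retrieved context exceeds the window, it is split into chunks and the answer is refined with one LLM call per chunk.

```python
calls = []  # record each LLM call so the multi-call behavior is visible

def fake_llm(prompt: str) -> str:
    # Stand-in for a real completion call; a real engine would call the model here.
    calls.append(prompt)
    return f"draft answer v{len(calls)}"

def refine_synthesize(question: str, chunks: list[str]) -> str:
    # One LLM call per context chunk; each call refines the previous answer.
    answer = ""
    for chunk in chunks:
        if not answer:
            prompt = f"Context: {chunk}\nQuestion: {question}"
        else:
            prompt = (f"Existing answer: {answer}\n"
                      f"New context: {chunk}\n"
                      f"Refine the answer to: {question}")
        answer = fake_llm(prompt)
    return answer

# Three chunks of context produce three LLM calls, each refining the last.
result = refine_synthesize("What are the limits?", ["chunk A", "chunk B", "chunk C"])
```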
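The memory-based filtering that keeps a SimpleChatEngine's conversation within limits can be sketched like this. This is a hedged illustration, not LlamaIndex's memory code: the whitespace "tokenizer" and the `trim_history`/`token_limit` names are assumptions, standing in for a real token counter and a real memory buffer. The idea is simply that the oldest turns are dropped until the remaining history fits the limit.

```python
def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer; counts whitespace-separated words.
    return len(text.split())

def trim_history(history: list[str], token_limit: int) -> list[str]:
    # Walk the history from newest to oldest, keeping messages until the
    # token budget is spent, then restore chronological order.
    kept, total = [], 0
    for msg in reversed(history):
        cost = count_tokens(msg)
        if total + cost > token_limit:
            break
        kept.append(msg)
        total += cost
    return list(reversed(kept))

# The oldest message is dropped once the limit would be exceeded.
trimmed = trim_history(["one two", "three four five", "six"], token_limit=4)
```

A design note: trimming from the oldest end preserves recent conversational context, which is usually what matters for the next reply, at the cost of the model forgetting early turns.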