Query Engine's Handling of Maximum Context Window Limits

At a glance

The community members discuss how a query_engine handles going over the maximum context window when using GPT-4 with an 8,192-token context window. They note that the query engine makes multiple LLM calls, with a response synthesizer refining the answer across those calls. The discussion also covers the ChatEngine, which typically requires a retriever or query engine (for example, over a vector store index), and the option of using a simpler agent or SimpleChatEngine when the context is passed in directly. However, SimpleChatEngine likely does not handle exceeding the context window itself; it relies on memory to filter messages and keep the conversation within limits.

Hello there. Was wondering, does query_engine handle going over the maximum context window?

Like if you're using GPT-4 with an 8,192-token context window, and your nodes are over that limit, how does the query engine handle that?
13 comments
I think there are multiple LLM calls in this case if I'm not wrong
Yea exactly. There is a response synthesizer that handles the context window, refining an answer over multiple LLM calls.
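In LlamaIndex terms, a minimal sketch of that refine behavior might look like this (the data directory and query string are illustrative, and the imports assume the llama_index.core package layout):

```python
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

# Load documents and build an index (the path is hypothetical).
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)

# "refine" mode sends retrieved nodes to the LLM one batch at a time,
# asking it to refine the running answer, so no single call has to fit
# every node inside the model's context window.
query_engine = index.as_query_engine(response_mode="refine")
print(query_engine.query("Summarize the key points."))
```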
ahh right thanks!
And just to double confirm, a Chat Engine requires a vector store index to be used, right?
I can't just use a Chat Engine like the ChatGPT UI.
Because for this specific thing I'm doing, I don't need vector data, just context passed in from the prompt.
A chat engine typically requires either a retriever or a query engine.

If you are just passing in all the context, you could use a simple agent, or SimpleChatEngine.
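A minimal SimpleChatEngine sketch, assuming the llama_index.core imports and an OpenAI LLM (the model name and prompt are placeholders):

```python
from llama_index.core.chat_engine import SimpleChatEngine
from llama_index.llms.openai import OpenAI

# No index or retriever needed; the context goes straight into the prompt.
llm = OpenAI(model="gpt-4")
chat_engine = SimpleChatEngine.from_defaults(llm=llm)

response = chat_engine.chat("Here is my context: ...\n\nQuestion: ...")
print(response)
```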
I'm assuming SimpleChatEngine also handles going over the context window, refining with multiple LLM calls?
Hmmm, I don't think it does actually
It relies on the memory to filter out older messages and keep the conversation within limits
Not to say it couldn't be updated I suppose
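For reference, that memory-based trimming might look like the following with ChatMemoryBuffer (the token_limit value is illustrative); messages beyond the limit are simply dropped from the prompt rather than refined over extra LLM calls:

```python
from llama_index.core.chat_engine import SimpleChatEngine
from llama_index.core.memory import ChatMemoryBuffer

# Cap how much chat history is replayed to the LLM; older messages past
# the token limit are dropped, not summarized or refined in extra calls.
memory = ChatMemoryBuffer.from_defaults(token_limit=6000)
chat_engine = SimpleChatEngine.from_defaults(memory=memory)
```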
I see, good to know!