Query Engine's Handling of Maximum Context Window Limits

Hello there. I was wondering, does query_engine handle going over the maximum context window?

Like if you're using gpt-4 with an 8192-token context window and your nodes are over that limit, how does the query engine handle that?
I think there are multiple LLM calls in this case if I'm not wrong
Yea exactly. There is a response synthesizer that handles the context window, refining an answer over multiple LLM calls.
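For example, a minimal sketch of that using refine mode, assuming the standard llama_index.core API (the "data" directory and query text are just placeholders):

```python
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

# Build an index over local documents ("data" is a placeholder path).
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)

# response_mode="refine" makes the synthesizer answer with the first chunk,
# then sequentially refine that answer with each remaining chunk, so no
# single LLM call has to fit all of the retrieved nodes at once.
query_engine = index.as_query_engine(response_mode="refine")

print(query_engine.query("What does the document say about context windows?"))
```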
ahh right thanks!
And just to double-check, Chat Engine requires a vector store index to be used, right?
I can't just use Chat Engine like the ChatGPT UI.
Because for this specific thing I'm doing, I don't need vector data, just context passed in from the prompt.
A chat engine typically requires either a retriever or a query engine.

If you are just passing in all the context, you could use a simple agent, or SimpleChatEngine.
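Something like this, assuming SimpleChatEngine's from_defaults API (the system prompt content is hypothetical):

```python
from llama_index.core.chat_engine import SimpleChatEngine

# SimpleChatEngine talks to the LLM directly, with no retriever or index.
# Any context you want the model to see can go in the system prompt.
chat_engine = SimpleChatEngine.from_defaults(
    system_prompt="You are a helpful assistant. Context: <your context here>",
)

print(chat_engine.chat("Summarize the context you were given."))
```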
I'm assuming SimpleChatEngine also handles going over the context window by refining over multiple LLM calls?
Hmmm, I don't think it does, actually.
It relies on the memory to filter out old messages and keep the conversation within limits.
Not to say it couldn't be updated, I suppose.
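For reference, a sketch of that memory-based trimming, assuming ChatMemoryBuffer from llama_index.core (the token_limit value is just an example):

```python
from llama_index.core.chat_engine import SimpleChatEngine
from llama_index.core.memory import ChatMemoryBuffer

# ChatMemoryBuffer drops the oldest messages once the stored history would
# exceed token_limit, keeping the prompt inside the model's context window.
# Note this truncates history; it does not refine over multiple LLM calls.
memory = ChatMemoryBuffer.from_defaults(token_limit=3000)

chat_engine = SimpleChatEngine.from_defaults(memory=memory)
print(chat_engine.chat("Hi!"))
```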
I see, good to know!