Has anyone else experienced non-relevant / too-broad responses when using a ChatMemoryBuffer?

Has anyone else experienced non-relevant / too-broad responses when using a ChatMemoryBuffer? If the chat memory is empty, a question about indexed content is correctly synthesized in the LLM response. However, if I first chat a little about general topics that are not in the indexed documents and then ask my question about the indexed content, the LLM response is worse and not relevant to my data sources.
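
For concreteness, a minimal sketch of this kind of setup (the "data" path, the token_limit, and the example questions are assumptions for illustration, not from the original post):

```python
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.memory import ChatMemoryBuffer

# Build an index over local documents ("data" is a placeholder path).
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)

# Chat engine in "context" mode with a memory buffer.
memory = ChatMemoryBuffer.from_defaults(token_limit=3000)
chat_engine = index.as_chat_engine(chat_mode="context", memory=memory)

# Off-topic small talk fills the memory first...
chat_engine.chat("Hi! How's the weather today?")
# ...then a question about the indexed documents gets a worse answer.
response = chat_engine.chat("What does the report say about Q3 revenue?")
print(response)
```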
I think the buffer keeps the conversation up to the set token limit, and your question is first transformed based on the conversation history before being passed to the query engine. Since you started with an off-topic conversation, the rewritten query may have been degraded by that context.

For example, check this code: https://github.com/run-llama/llama_index/blob/3823389e3f91cab47b72e2cc2814826db9f98e32/llama-index-core/llama_index/core/chat_engine/condense_question.py#L177
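
The rewriting is easy to observe with a condense-style engine. A sketch reusing the `index` and `memory` from the snippet above (in my understanding, `verbose=True` makes the engine print the condensed standalone question before querying):

```python
# With chat_mode="condense_question", each user message is first rewritten
# into a standalone question using the chat history, and that rewritten
# question is what gets sent to the query engine over the index.
chat_engine = index.as_chat_engine(
    chat_mode="condense_question",
    memory=memory,
    verbose=True,  # prints the condensed question, so you can see the rewrite
)
response = chat_engine.chat("What about the second quarter?")
print(response)
```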
I'm currently using the ChatEngine with chat_mode='context'. So should I be using chat_mode 'condense_plus_context' instead?
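
If so, the switch is a one-line change. A sketch with the same assumed `index` and `memory` as above (resetting the engine is one way to drop earlier off-topic turns entirely):

```python
# condense_plus_context first condenses the question from the chat history,
# then retrieves context for the condensed question and answers using both
# the retrieved context and the conversation.
chat_engine = index.as_chat_engine(
    chat_mode="condense_plus_context",
    memory=memory,
)
response = chat_engine.chat("And what did the report conclude?")
print(response)

# Clearing the memory discards prior off-topic conversation.
chat_engine.reset()
```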