Is there a way to include memory buffer

Is there a way to include a memory buffer along with a hybrid retriever with Chat Engine - Condense Plus Context Chat?
17 comments
if you attach it to a chat engine or agent, yes
ok, can you share some indicative code? That would be very helpful
Plain Text
from llama_index.chat_engine import CondensePlusContextChatEngine

chat_engine = CondensePlusContextChatEngine.from_defaults(retriever, memory=memory)


If you don't provide the memory, it just automatically defaults to a chat memory buffer
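For illustration, a minimal sketch of what that default amounts to, assuming a pre-0.10 llama_index install and an existing retriever (leaving memory out is roughly equivalent to passing a default ChatMemoryBuffer yourself):

from llama_index.chat_engine import CondensePlusContextChatEngine
from llama_index.memory import ChatMemoryBuffer

# Roughly equivalent to omitting `memory`: the engine falls back to a default chat memory buffer
memory = ChatMemoryBuffer.from_defaults()
chat_engine = CondensePlusContextChatEngine.from_defaults(retriever, memory=memory)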
When I use this, I hit a context limit exceeded error. Is there a way to refresh the memory buffer?
you can set a token_limit on the memory, but you also need to be careful about what the top-k is on your retriever (too much context will also cause issues)
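A minimal sketch of both knobs together, assuming a pre-0.10 llama_index install; index is an assumed existing vector index, and the values 1500 and 3 are only illustrative (for a custom hybrid retriever you would set the top-k on its underlying retrievers instead):

from llama_index.chat_engine import CondensePlusContextChatEngine
from llama_index.memory import ChatMemoryBuffer

# Cap how much chat history is kept between turns (illustrative value)
memory = ChatMemoryBuffer.from_defaults(token_limit=1500)

# Keep the retrieved context small as well (illustrative value)
retriever = index.as_retriever(similarity_top_k=3)

chat_engine = CondensePlusContextChatEngine.from_defaults(retriever, memory=memory)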
Actually, it keeps adding previous answers, which leads to the token limit being exceeded. With a regular query_engine this error never occurs.
Right. Keeping previous answers is what a chat engine is for 😅 it's part of the chat.

Like I said, you can set the token limit, so that the memory only remembers the last X messages that fit into the token limit
Hmm, the only challenge I see with the token_limit setting is that it may abruptly truncate a chat message.
It won't truncate a message. It will remove the entire message instead
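A quick hedged sketch that shows the eviction behaviour directly on a ChatMemoryBuffer (the tiny token_limit and the message texts are made up for the demo):

from llama_index.llms import ChatMessage, MessageRole
from llama_index.memory import ChatMemoryBuffer

memory = ChatMemoryBuffer.from_defaults(token_limit=50)  # deliberately tiny limit

memory.put(ChatMessage(role=MessageRole.USER, content="first question about the docs"))
memory.put(ChatMessage(role=MessageRole.ASSISTANT, content="a fairly long first answer " * 10))
memory.put(ChatMessage(role=MessageRole.USER, content="second question"))

# get() drops whole messages from the oldest side until the remaining
# history fits under token_limit; no single message is cut in half
print(memory.get())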
memory = ChatMemoryBuffer.from_defaults(token_limit=4096)

if st.session_state.messages[-1]["role"] != "assistant":
    with st.chat_message("assistant"):
        with st.spinner("Thinking..."):
            llm = OpenAI(model="gpt-3.5-turbo")
            service_context = ServiceContext.from_defaults(llm=llm)
            query_engine = RetrieverQueryEngine.from_args(retriever=hybrid_retriever, service_context=service_context)
            chat_engine = CondensePlusContextChatEngine.from_defaults(query_engine, memory=memory, system_prompt=context_prompt)
            response, chat_messages = chat_engine.chat(str(prompt))
            if "not mentioned in" in response.response or "I don't know" in re…

Hi Logan, I need some clarification on the above.
What does the memory variable store: just the previous Q&A, or the Q&A plus the context retrieved for each previous question?
Just the previous Q&A; the underlying retrieved context is not saved (it would be way too large)
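A hedged way to see this for yourself, assuming memory is the ChatMemoryBuffer attached to the chat engine above:

# The buffer holds only the user/assistant messages from earlier turns;
# the node text pulled in by the retriever is not written back into it
for message in memory.get_all():
    print(message.role, ":", str(message.content)[:80])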
got it, thanks for confirming my understanding
So the token_limit of 4096 in memory = ChatMemoryBuffer.from_defaults(token_limit=4096) will be the limit that makes sure the previous Q&A does not exceed 4096 tokens?