
At a glance
The community member asked if there are any memory modules available that can be injected into bots, similar to RollingWindow Memory, and requested a notebook for reference. The comments suggest that the ChatMemoryBuffer from the llama_index.memory module can be used to configure a rolling window memory for chat engines. Community members discussed how to extract and save the memory externally, either by pickling or converting to JSON. They also explored the possibility of using a retriever instead of a chat engine and passing the memory as a parameter, though this was noted to be a more manual process. The community members provided code examples for using the ChatMemoryBuffer and RetrieverQueryEngine to manage the chat history. There is no explicitly marked answer in the comments.
@Logan M question - are there any memory modules available that can be injected into bots, something like RollingWindow Memory? If there is one, can you direct me to the notebook?
All our agents/chat engines use ChatMemoryBuffer by default, which is essentially a rolling window

You could also configure this manually

Plain Text
from llama_index.memory import ChatMemoryBuffer

# Rolling window: older messages are dropped once the buffer exceeds the token limit
memory = ChatMemoryBuffer.from_defaults(token_limit=3900)

chat_engine = index.as_chat_engine(
    chat_mode="condense_plus_context",
    memory=memory,
    verbose=False,
)
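For example (the question string is just illustrative), each chat() call then reads from and writes to the shared buffer:

Plain Text
# Each call sees the prior turns, trimmed to the token limit
response = chat_engine.chat("What have we talked about so far?")
print(response)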
Yeah -- I am trying to see how I can extract it and save it externally
Is that possible?
Since it's a pydantic object, yes 👍

You should be able to either pickle it, or turn it into json and save/load it
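
For reference, here is a minimal sketch of the JSON route (assuming the legacy llama_index API; the file name and surrounding code are illustrative):

Plain Text
import json

from llama_index.llms import ChatMessage
from llama_index.memory import ChatMemoryBuffer

memory = ChatMemoryBuffer.from_defaults(token_limit=3900)
# ... chat happens, filling the buffer ...

# Save: each ChatMessage is a pydantic object, so .dict() yields plain data
with open("chat_history.json", "w") as f:
    json.dump([m.dict() for m in memory.get_all()], f)

# Load: rebuild the messages and seed a fresh buffer with them
with open("chat_history.json") as f:
    history = [ChatMessage(**m) for m in json.load(f)]
memory = ChatMemoryBuffer.from_defaults(token_limit=3900, chat_history=history)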
Is it possible, instead of using a chat engine, to use a retriever and then pass the memory to it as a parameter when I call query? I hope I have explained myself. Thanks!
Not really? Like you could do that, but it would be a little manual. Anything is possible with a bit of elbow grease
Plain Text
from llama_index import get_response_synthesizer

# Retrieve nodes manually, then synthesize a response over them
retriever = index.as_retriever(similarity_top_k=2)
nodes = retriever.retrieve("query")

response_synthesizer = get_response_synthesizer()
response = response_synthesizer.synthesize("query", nodes)
I am currently using a RetrieverQueryEngine with a custom prompt (no memory right now). Where would you suggest I start/look?
If you don't want to use a chat engine, you can use the above to either inject the chat history into the query, or modify the prompts to include chat history.

You can use the memory manually as well

Plain Text
from llama_index.memory import ChatMemoryBuffer
from llama_index.llms import ChatMessage

memory = ChatMemoryBuffer.from_defaults(token_limit=1500)
# put() appends a message; get() returns the history trimmed to the token limit
memory.put(ChatMessage(role="user", content="hello"))
chat_history = memory.get()
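
Putting the two together, a minimal sketch of injecting the rolling history into each query (the chat helper and prompt layout are illustrative, not a library API):

Plain Text
from llama_index.llms import ChatMessage
from llama_index.memory import ChatMemoryBuffer

memory = ChatMemoryBuffer.from_defaults(token_limit=1500)

def chat(query_engine, memory, user_msg):
    """Hypothetical helper: prepend the trimmed history to the query text."""
    memory.put(ChatMessage(role="user", content=user_msg))
    # Flatten the rolling window into plain text
    history = "\n".join(f"{m.role.value}: {m.content}" for m in memory.get())
    response = query_engine.query(f"Chat history:\n{history}\n\nQuestion: {user_msg}")
    memory.put(ChatMessage(role="assistant", content=str(response)))
    return str(response)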
Thanks, let me try. I'll try to understand how to inject the history first. Thanks!