Hi, I found that ChatMemoryBuffer doesn't cut the list of messages to fit the token limit:
Plain Text
from llama_index.core.memory import ChatMemoryBuffer

chatmemory = ChatMemoryBuffer.from_defaults(chat_history=chat_history, token_limit=history_limit)

Am I missing something? Is there a way to restrict the number of messages according to the token_limit?
I'm not sure what you mean, it should always cut off 😅
There are unit tests for it as well
I thought the token_limit tells this method to cut the list to fit the limit, no? After running the code above, I see the length of the chat memory store is the same as the original chat_history.
Plain Text
len(chatmemory.chat_store.store['chat_history'])

So, for my experiment I set history_limit to just 200 tokens and passed a huge chat_history, about 4,500 messages, but the length of the store didn't change.
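For reference, a minimal sketch of the check being described here, assuming chat_history is just a list of ChatMessage objects (the 4,500 placeholder messages below are invented):
Plain Text
from llama_index.core.llms import ChatMessage
from llama_index.core.memory import ChatMemoryBuffer

chat_history = [ChatMessage(role="user", content=f"message {i}") for i in range(4500)]
history_limit = 200

chatmemory = ChatMemoryBuffer.from_defaults(chat_history=chat_history, token_limit=history_limit)

# The backing store still holds every message that was passed in:
print(len(chatmemory.chat_store.store["chat_history"]))  # 4500
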
Seems to work for me

Plain Text
>>> from llama_index.core.memory import ChatMemoryBuffer
>>> from llama_index.core.llms import ChatMessage
>>> memory = ChatMemoryBuffer.from_defaults(token_limit=100)
>>> message = ChatMessage(role="user", content="a "*10)
>>> for _ in range(20):
...   memory.put(message)
... 
>>> len(memory.get())
9
>>> len(memory.get_all())
20
>>> 
Let me check, I guess I'm looking at the wrong things
Yeah, it looks like it works, but it wasn't obvious to me because I had to look at the get() method, not the store. Thanks!
ah I see, great! Yeah, get() and put() are what's usually used under the hood 💪
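For anyone reading along, a rough sketch of that put()/get() cycle (the token_limit and message contents are invented, and this mirrors what a chat engine generally does with the memory rather than any specific engine's code):
Plain Text
from llama_index.core.llms import ChatMessage
from llama_index.core.memory import ChatMemoryBuffer

memory = ChatMemoryBuffer.from_defaults(token_limit=3000)

# Each turn, new messages are put() into memory...
memory.put(ChatMessage(role="user", content="What does token_limit do?"))
memory.put(ChatMessage(role="assistant", content="It caps how much history get() returns."))

# ...and get() returns only as much recent history as fits within token_limit,
# while the underlying chat store keeps everything.
recent_history = memory.get()
print(len(recent_history))  # 2
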
One more question, though. I'm a bit puzzled about which order I should use when putting the messages into the memory. Should newer ones come first or last? Thanks!
Newer ones go last 👍
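A small sketch of that ordering, with invented message contents: messages are put() oldest first, so the newest one is added last.
Plain Text
from llama_index.core.llms import ChatMessage
from llama_index.core.memory import ChatMemoryBuffer

memory = ChatMemoryBuffer.from_defaults(token_limit=3000)

old_to_new = [
    ChatMessage(role="user", content="first (oldest) message"),
    ChatMessage(role="assistant", content="a reply in the middle"),
    ChatMessage(role="user", content="newest message, added last"),
]
for msg in old_to_new:  # chronological order: oldest first, newest last
    memory.put(msg)

print(memory.get()[-1].content)  # "newest message, added last"
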
Gotcha, thanks!