
How to build a simple chat (no index)

How to build a simple chat (no index) with a message history limit rather than a token limit (i.e., only the last K messages will be taken into account)?
You can do something like this:

Plain Text
from llama_index.core.chat_engine import SimpleChatEngine

chat_engine = SimpleChatEngine.from_defaults(chat_history=ADD_HERE)


This is with SimpleChatEngine; you can choose another chat engine: https://docs.llamaindex.ai/en/stable/examples/chat_engine/chat_engine_personality/
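For reference (the type comes up below), chat_history is a list of ChatMessage objects, so a minimal runnable version might look like this; the model and seed messages are placeholders:

Plain Text
from llama_index.core.chat_engine import SimpleChatEngine
from llama_index.core.llms import ChatMessage
from llama_index.llms.openai import OpenAI

# Seed the engine with an existing conversation (placeholder messages).
chat_history = [
    ChatMessage(role="user", content="Hi, I'm looking for a two-bedroom apartment."),
    ChatMessage(role="assistant", content="Sure, which neighborhood are you interested in?"),
]
chat_engine = SimpleChatEngine.from_defaults(
    llm=OpenAI(model="gpt-4-turbo"),
    chat_history=chat_history,
)
print(chat_engine.chat("What about something with a balcony?"))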
Thanks @WhiteFang_Jr
So do I have to keep the history myself in a deque and reinitialize the chat_engine every time there's a new message?
I tried this code:

Plain Text
import os
from collections import deque
from llama_index.core.chat_engine import SimpleChatEngine
from llama_index.llms.openai import OpenAI

chat_history = deque(maxlen=4)
llm = OpenAI(model="gpt-4-turbo", temperature=0)

system_prompt = "You are a pro real estate agent"
chat_engine = SimpleChatEngine.from_defaults(chat_history=chat_history, system_prompt=system_prompt, llm=llm)

while True:
    user_input = input("User: ")
    response = chat_engine.chat(user_input)
    print("Bot:", response)
    print(f"History length: {len(chat_engine.chat_history)} \n")


However, as you can see, the history length exceeds 4, which seems to me like a bug.
@WhiteFang_Jr I edited my response, please see above. This seems to me like a bug
Edit 2: I guess I should have used "memory" instead of "chat_history". I now realize the difference 🙂
chat_history should be of type: Optional[List[ChatMessage]]
I also tried with memory, but I see that it doesn't limit the context to 4 messages.
Plain Text
chat_history = [
    ChatMessage(content=system_prompt, role="system"),
    ChatMessage(content="user_msg", role="user"),
]

You don't need to create the chat engine every turn. Create it before the while loop; then, after each turn, check the length of chat_history, trim it, and create a new chat_engine instance only when it has grown too long. For example:
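A sketch of that approach; the cap K = 4 and the rebuild-on-overflow check are illustrative, not part of any LlamaIndex API:

Plain Text
from llama_index.core.chat_engine import SimpleChatEngine
from llama_index.llms.openai import OpenAI

K = 4  # illustrative message cap
system_prompt = "You are a pro real estate agent"
llm = OpenAI(model="gpt-4-turbo", temperature=0)
chat_engine = SimpleChatEngine.from_defaults(llm=llm, system_prompt=system_prompt)

while True:
    response = chat_engine.chat(input("User: "))
    print("Bot:", response)
    history = chat_engine.chat_history
    if len(history) > K:
        # Keep only the last K messages and rebuild the engine with them.
        chat_engine = SimpleChatEngine.from_defaults(
            llm=llm,
            system_prompt=system_prompt,
            chat_history=list(history[-K:]),
        )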
I don't understand. Why isn't it enough to create the chat engine before the while loop as follows:

Plain Text
memory = deque(maxlen=4)
llm = OpenAI(model="gpt-4-turbo", temperature=0)
chat_engine = SimpleChatEngine.from_defaults(memory=memory, system_prompt=system_prompt, llm=llm)
I would expect that the memory within SimpleChatEngine would be initialized with a deque limited to 4 items. This deque can hold any object, such as a ChatMessage.
The size of the memory is not limited by the number of messages, because messages can be big or small.

Instead, it's limited by token count:

Plain Text
from llama_index.core.memory import ChatMemoryBuffer

memory = ChatMemoryBuffer.from_defaults(token_limit=3000)
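Wired into the engine from the snippets above (this assumes SimpleChatEngine.from_defaults accepts a memory argument, as used earlier in this thread):

Plain Text
from llama_index.core.chat_engine import SimpleChatEngine
from llama_index.core.memory import ChatMemoryBuffer
from llama_index.llms.openai import OpenAI

# The history is capped by tokens; the oldest messages are dropped once the limit is hit.
memory = ChatMemoryBuffer.from_defaults(token_limit=3000)
chat_engine = SimpleChatEngine.from_defaults(
    memory=memory,
    llm=OpenAI(model="gpt-4-turbo", temperature=0),
)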
Thanks @Logan M. Isn't there a way to use the memory parameter to limit the number of messages? I think it's very useful to limit by messages (maybe in addition to tokens) rather than just by tokens.
Also, what's the meaning then of setting the memory as a deque?

Plain Text
memory = deque(maxlen=4)
chat_engine = SimpleChatEngine.from_defaults(memory=memory)
Why do you want to limit the number of messages vs. a token limit? A message count doesn't really make practical sense in my mind.
If you want to make your own memory module, you can implement the base class, for example:
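A minimal sketch of such a module, assuming the BaseMemory interface exposed by llama_index.core.memory (from_defaults / get / get_all / put / set / reset); LastKMemory and its k field are made-up names for illustration:

Plain Text
from typing import Any, List, Optional

from llama_index.core.llms import ChatMessage
from llama_index.core.memory import BaseMemory


class LastKMemory(BaseMemory):
    """Hypothetical memory module that sends only the last k messages to the LLM."""

    chat_history: List[ChatMessage] = []
    k: int = 4

    @classmethod
    def from_defaults(
        cls, chat_history: Optional[List[ChatMessage]] = None, k: int = 4, **kwargs: Any
    ) -> "LastKMemory":
        return cls(chat_history=list(chat_history or []), k=k)

    def get(self, input: Optional[str] = None, **kwargs: Any) -> List[ChatMessage]:
        # Only the last k messages are placed in the prompt.
        return self.chat_history[-self.k:]

    def get_all(self) -> List[ChatMessage]:
        return self.chat_history

    def put(self, message: ChatMessage) -> None:
        self.chat_history.append(message)

    def set(self, messages: List[ChatMessage]) -> None:
        self.chat_history = list(messages)

    def reset(self) -> None:
        self.chat_history = []


# Hypothetical usage:
# chat_engine = SimpleChatEngine.from_defaults(memory=LastKMemory.from_defaults(k=4), llm=llm)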
As for memory = deque(maxlen=4): I'm not sure where you got that, but it doesn't work.