How to build a simple chat (no index)

How to build a simple chat (no index) with a message history limit rather than a token limit (i.e. only the last K messages will be taken into account)?
You can do something like this:

Plain Text
from llama_index.core.chat_engine import SimpleChatEngine

# replace ADD_HERE with your existing chat history (a list of ChatMessage)
chat_engine = SimpleChatEngine.from_defaults(chat_history=ADD_HERE)


This is with SimpleChatEngine; you can also choose another chat engine: https://docs.llamaindex.ai/en/stable/examples/chat_engine/chat_engine_personality/
Thanks @WhiteFang_Jr
So do I have to keep the history myself in a deque and reinitialize the chat_engine every time there's a new message?
I tried this code:

Plain Text
from collections import deque
from llama_index.core.chat_engine import SimpleChatEngine
from llama_index.llms.openai import OpenAI

chat_history = deque(maxlen=4)
llm = OpenAI(model="gpt-4-turbo", temperature=0)

system_prompt = "You are a pro real estate agent"
chat_engine = SimpleChatEngine.from_defaults(chat_history=chat_history, system_prompt=system_prompt, llm=llm)

while True:
    user_input = input("User: ")
    response = chat_engine.chat(user_input)
    print("Bot:", response)
    print(f"History length: {len(chat_engine.chat_history)} \n")


However, as you can see, the size of the history exceeds 4, which looks like a bug to me.
@WhiteFang_Jr I edited my response, please see above. This seems to me like a bug
Edit 2: I guess I should have used "memory" instead of "chat_history". I now realize the difference 🙂
chat_history should be of type: Optional[List[ChatMessage]]
I also tried with memory, but I see that it doesn't limit the context to 4 messages.
Plain Text
chat_history = [
    ChatMessage(content=system_prompt, role="system"),
    ChatMessage(content="user_msg", role="user"),
]

You don't need to create the chat engine every turn. Create it once before the while loop; then, on each turn, check the chat_history length, trim it if needed, and create the chat_engine instance again, as in the sketch below.
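A rough sketch of that loop (reusing the llm and system_prompt from your snippet; the K value here is just illustrative):

Plain Text
K = 4
chat_engine = SimpleChatEngine.from_defaults(system_prompt=system_prompt, llm=llm)

while True:
    user_input = input("User: ")
    response = chat_engine.chat(user_input)
    print("Bot:", response)

    # Trim to the last K messages and rebuild the engine when the history grows too long
    history = chat_engine.chat_history
    if len(history) > K:
        chat_engine = SimpleChatEngine.from_defaults(
            chat_history=list(history)[-K:],
            system_prompt=system_prompt,
            llm=llm,
        )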
I don't understand. Why isn't it enough to create the chat engine before the while loop, as follows:

Plain Text
memory = deque(maxlen=4)
llm = OpenAI(model="gpt-4-turbo", temperature=0)
chat_engine = SimpleChatEngine.from_defaults(memory=memory, system_prompt=system_prompt, llm=llm)
I would expect the memory within SimpleChatEngine to be initialized as a deque limited to 4 items. This deque can hold any object, such as a ChatMessage.
The size of the memory is not limited by the number of messages, because messages can be big or small.

Instead, it's limited by token count:

Plain Text
from llama_index.core.memory import ChatMemoryBuffer

memory = ChatMemoryBuffer.from_defaults(token_limit=3000)
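To wire that into the engine (assuming the same system_prompt and llm as in the snippets above):

Plain Text
chat_engine = SimpleChatEngine.from_defaults(
    memory=memory,
    system_prompt=system_prompt,
    llm=llm,
)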
Thanks @Logan M. Isn't there a way to use the memory parameter to limit the number of messages? I think limiting by message count (maybe in addition to tokens) is very useful, rather than limiting by tokens alone.
Also, what's the meaning, then, of setting the memory as a deque?

Plain Text
memory = deque(maxlen=4)
chat_engine = SimpleChatEngine.from_defaults(memory=memory)
Why do you want to limit the number of messages vs. a token limit? Limiting by message count doesn't really make practical sense in my mind
If you want to make your own memory module, you can implement the base class
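For example, a rough sketch of a message-count-limited memory (the LastKMemory class, its k field, and the exact BaseMemory import path and method set are assumptions here and may differ across versions):

Plain Text
from typing import Any, List, Optional

from llama_index.core.llms import ChatMessage
from llama_index.core.memory.types import BaseMemory


class LastKMemory(BaseMemory):
    """Illustrative memory that only exposes the last k messages to the LLM."""

    chat_history: List[ChatMessage] = []
    k: int = 4

    @classmethod
    def from_defaults(
        cls, chat_history: Optional[List[ChatMessage]] = None, k: int = 4, **kwargs: Any
    ) -> "LastKMemory":
        return cls(chat_history=chat_history or [], k=k)

    def get(self, input: Optional[str] = None, **kwargs: Any) -> List[ChatMessage]:
        # Only the last k messages are ever sent to the LLM
        return self.chat_history[-self.k:]

    def get_all(self) -> List[ChatMessage]:
        # The full history is still stored, just not all of it is used per turn
        return list(self.chat_history)

    def put(self, message: ChatMessage) -> None:
        self.chat_history.append(message)

    def set(self, messages: List[ChatMessage]) -> None:
        self.chat_history = list(messages)

    def reset(self) -> None:
        self.chat_history = []


You would then pass it in with SimpleChatEngine.from_defaults(memory=LastKMemory(k=4), llm=llm).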
I'm not sure where you got that, but it doesn't work