
How to configure external chat memory for a chat engine

@kapa.ai How do I configure external chat memory for a chat engine?
Let me check the knowledge base for you. One moment! :books:
----------------------
To configure the chat history in LlamaIndex, you can use the ChatMemoryBuffer class from the llama_index.memory module. This class allows you to set a token limit for the chat history. Here's an example of how to do it:

Plain Text
from llama_index.memory import ChatMemoryBuffer

memory = ChatMemoryBuffer.from_defaults(token_limit=1500)


In this example, the token limit for the chat history is set to 1500. This means that the chat history will only retain the last 1500 tokens of conversation.
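If your chat history lives outside LlamaIndex (for example, loaded from your own database), you can also seed the buffer with it. Here's a minimal sketch (the messages below are just placeholders), assuming your LlamaIndex version exposes ChatMessage and MessageRole from llama_index.llms:

Plain Text
from llama_index.llms import ChatMessage, MessageRole
from llama_index.memory import ChatMemoryBuffer

# Hypothetical messages loaded from an external store
history = [
    ChatMessage(role=MessageRole.USER, content="What did Paul Graham do growing up?"),
    ChatMessage(role=MessageRole.ASSISTANT, content="He wrote short stories and programmed."),
]

# Seed the buffer with the existing history; it is still trimmed to the token limit
memory = ChatMemoryBuffer.from_defaults(chat_history=history, token_limit=1500)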

After setting up the memory buffer, you can use it to configure a chat engine:

Plain Text
chat_engine = index.as_chat_engine(
    chat_mode="context",
    memory=memory,
    system_prompt=(
        "You are a chatbot, able to have normal interactions, as well as talk"
        " about an essay discussing Paul Grahams life."
    ),
)


In this example, the chat engine is configured to use the memory buffer we just set up. The chat_mode is set to "context", which means that for each user message the engine retrieves relevant context from the index and combines it with the chat history held in the memory buffer when generating a response.
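Once the chat engine is configured, you can chat with it and the memory buffer is updated automatically. A minimal usage sketch (the questions are just placeholders):

Plain Text
response = chat_engine.chat("What did Paul Graham do growing up?")
print(response)

# The exchange above is now in the memory buffer, so follow-ups can refer back to it
response = chat_engine.chat("What did he do after that?")
print(response)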

You can find more details in the LlamaIndex documentation.
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
@kapa.ai What types of memory can we use for the chat_engine?
Let me check the knowledge base for you. One moment! :books:
----------------------
In LlamaIndex, you can use the ChatMemoryBuffer class from the llama_index.memory module for the chat engine. This class provides a memory buffer that retains a certain amount of chat history based on a token limit.

Here's an example of how to use it:

Plain Text
from llama_index.memory import ChatMemoryBuffer

memory = ChatMemoryBuffer.from_defaults(token_limit=1500)


In this example, the token limit for the chat history is set to 1500. This means that the chat history will only retain the last 1500 tokens of conversation.
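If you need the memory to live outside the process (truly external chat memory), recent LlamaIndex versions also let you back the buffer with a chat store. This is a sketch that assumes SimpleChatStore is available under llama_index.storage.chat_store in your version; the store key is a made-up example:

Plain Text
from llama_index.memory import ChatMemoryBuffer
from llama_index.storage.chat_store import SimpleChatStore

chat_store = SimpleChatStore()

memory = ChatMemoryBuffer.from_defaults(
    token_limit=1500,
    chat_store=chat_store,
    chat_store_key="user_123",  # hypothetical key identifying this conversation
)

# Persist the store to disk and reload it in a later session
chat_store.persist(persist_path="chat_store.json")
loaded_store = SimpleChatStore.from_persist_path(persist_path="chat_store.json")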

Additionally, when integrating LlamaIndex with LangChain, you can use the GPTIndexChatMemory class from the llama_index.langchain_helpers.memory_wrapper module. This class wraps a LlamaIndex index so it can be used as a memory module in LangChain.

Here's an example of how to use it:

Plain Text
from llama_index import SummaryIndex
from llama_index.langchain_helpers.memory_wrapper import GPTIndexChatMemory

index = SummaryIndex([])
memory = GPTIndexChatMemory(
    index=index,
    memory_key="chat_history",
    query_kwargs={"response_mode": "compact"},
    return_source=True,
    return_messages=True,
)


In this example, the GPTIndexChatMemory is configured to use a SummaryIndex as the underlying index. The memory_key is set to "chat_history", which means the chat history will be stored under this key in the memory.
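You can then plug this memory into a LangChain agent. A rough sketch along the lines of the integrations notebook, assuming a LangChain version that still provides initialize_agent and langchain.llms.OpenAI (the tools list is left empty for brevity):

Plain Text
from langchain.llms import OpenAI
from langchain.agents import initialize_agent

llm = OpenAI(temperature=0)

# The agent reads and writes conversation turns through GPTIndexChatMemory
agent_executor = initialize_agent(
    [], llm, agent="conversational-react-description", memory=memory
)

agent_executor.run(input="hi, i am bob")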

You can find more details in the LlamaIndex documentation and the LlamaIndex LangChain Integrations notebook.
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------