----------------------
In LlamaIndex, you can use the `ChatMemoryBuffer` class from the `llama_index.memory` module for the chat engine. This class provides a memory buffer that retains a sliding window of chat history, trimmed to a configurable token limit.
Here's an example of how to use it:
```python
from llama_index.memory import ChatMemoryBuffer

# Retain roughly the last 1500 tokens of conversation
memory = ChatMemoryBuffer.from_defaults(token_limit=1500)
```
In this example, the token limit for the chat history is set to 1500, so the buffer keeps only the most recent messages that fit within 1500 tokens and drops older ones as the conversation grows.
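To wire the buffer into a chat engine, you can pass it via the `memory` argument of `as_chat_engine`. Here's a minimal sketch, assuming you already have documents to index; the `data/` directory is a placeholder:

```python
from llama_index import VectorStoreIndex, SimpleDirectoryReader
from llama_index.memory import ChatMemoryBuffer

# Placeholder: build an index over your own documents
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)

memory = ChatMemoryBuffer.from_defaults(token_limit=1500)

# The engine carries conversation context between turns via the buffer
chat_engine = index.as_chat_engine(chat_mode="context", memory=memory)
print(chat_engine.chat("Summarize the documents in one sentence."))
```

Once the accumulated history exceeds the token limit, the oldest messages are dropped first, so the most recent context is always preserved.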
Additionally, when integrating LlamaIndex with LangChain, you can use the `GPTIndexChatMemory` class from the `llama_index.langchain_helpers.memory_wrapper` module. This class wraps a LlamaIndex index so it can serve as a memory module in LangChain.
Here's an example of how to use it:
```python
from llama_index import SummaryIndex
from llama_index.langchain_helpers.memory_wrapper import GPTIndexChatMemory

# Start with an empty index; chat messages are inserted into it over time
index = SummaryIndex([])
memory = GPTIndexChatMemory(
    index=index,
    memory_key="chat_history",
    query_kwargs={"response_mode": "compact"},
    return_source=True,
    return_messages=True,
)
```
In this example, `GPTIndexChatMemory` is configured to use a `SummaryIndex` as the underlying index. The `memory_key` is set to `"chat_history"`, which means the chat history will be stored under this key in the memory.
You can find more details in the LlamaIndex documentation and the LlamaIndex Langchain Integrations notebook.
----------------------
Note: you can ask me a follow-up question by @ mentioning me again :speech_balloon:
----------------------