Hello All!

I think there could be a bug in the OpenAI agent memory.

I have made my own tool to query BigQuery, please see below:

Plain Text
from dotenv import load_dotenv; load_dotenv()
from llama_index.agent import OpenAIAgent
from llama_index.llms import OpenAI
from tools import BigqueryTool

# Full prompt text omitted for brevity
AGENT_SYSTEM_PROMPT = "You are an agent tasked with helping analyse ..."

llm = OpenAI(model="gpt-3.5-turbo-0613", temperature=0)
bigquery_tool_spec = BigqueryTool(llm=llm)
tools = bigquery_tool_spec.to_tool_list()

agent = OpenAIAgent.from_tools(
    tools=tools,
    llm=llm,
    verbose=True,
    system_prompt=AGENT_SYSTEM_PROMPT,
)

response = agent.chat("Can you first grab the schema?")
print(str(response))


The response is good; I have shortened it below!

Plain Text
=== Calling Function ===
Calling function: get_schema with args: {}
Got output: [{"name": "hour", "type": "DATETIME", "mode": "NULLABLE"}, ...]
========================
The schema of the AirGrid platform is as follows:

1. hour: DATETIME
2. buyer_member_id: INTEGER
...


We can see the function returned the data needed and the LLM summarised it!

Now when I call agent.all_messages, I get:

[ChatMessage(role=<MessageRole.SYSTEM: 'system'>, content='You are an agent tasked with helping analyse ....', additional_kwargs={})]

Only the initial system prompt set by AGENT_SYSTEM_PROMPT is present.
13 comments
what's the bug? πŸ‘€
Sorry @Logan M, I hit enter but was still mid-explanation πŸ™‚
Important to mention: in some cases the chat history IS maintained, but on this specific call it is never maintained.

The response is rather large... so I was wondering if it is dropped due to length?
It could be dropped due to length. Try agent.chat_history or agent.memory.get_all() to get the full chat history

The memory cuts off at a certain token limit
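
For example, a minimal sketch of the difference, assuming the default ChatMemoryBuffer is backing the agent:

Plain Text
# Both of these return the full, untruncated history,
# while memory.get() applies the token limit.
for msg in agent.memory.get_all():  # or: agent.chat_history
    print(msg.role, msg.content)

# The token-limited window, i.e. what actually fits under the limit
windowed = agent.memory.get()
print(f"{len(windowed)} messages fit inside the token limit")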
Ahhhh yes ok!
Thanks very much @Logan M πŸ™‚
I am looking at the source, but if you know: does the full history get sent on subsequent calls?
And where does the logic live for that decision-making?
After fetching the schema and asking:

Plain Text
response = agent.chat("Ok great, can you tell me just the type of the column called video_content_duration?")
print(str(response))


The agent makes another call to the get_schema function.
Cool, so I can make my own ChatMemoryBuffer and increase the token limit πŸ™‚
yup exactly, you can pass it in and set the limit manually πŸ™‚
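
Something like this (a sketch against the same llama_index imports as above; the token_limit value is illustrative, not a recommendation):

Plain Text
from llama_index.memory import ChatMemoryBuffer

# A larger buffer so the big schema response is not trimmed away
memory = ChatMemoryBuffer.from_defaults(token_limit=8000)  # illustrative value

agent = OpenAIAgent.from_tools(
    tools=tools,
    llm=llm,
    verbose=True,
    system_prompt=AGENT_SYSTEM_PROMPT,
    memory=memory,  # pass the custom buffer in explicitly
)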
Thanks again πŸ™‚