Concurrent execution of openai agent function in asynchronous environment

At a glance

The community members are discussing the concurrent execution of functions in an asynchronous environment using the OpenAIAgent from the llama_index library. They experiment with different approaches, including using asyncio.gather and the FunctionCallingAgent. The community members find that the OpenAIAgent does not support concurrent function calls by default, but the FunctionCallingAgent can be used to achieve this. They also discuss the differences between the two agents and the potential for adding concurrent function calling to the OpenAIAgent. Eventually, one of the community members creates a pull request to add this functionality to the OpenAIAgent.

Hello, is it possible for OpenAIAgent function execution to be concurrent, especially in an asynchronous environment? Note that this is different from parallel function calling. What I am looking for is a way to execute those parallel function calls concurrently, something like asyncio.gather. Thanks!!

Relevant docs: https://docs.llamaindex.ai/en/stable/examples/agent/openai_agent_parallel_function_calling/#example-from-openai-docs

I tried putting asyncio.sleep inside get_current_weather (modified to be async) to confirm that the function executions are in fact not concurrent.
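For reference, this is the kind of concurrency being asked about; a minimal plain-asyncio sketch (my own illustration, no llama_index involved) where three awaited sleeps finish in roughly one second total rather than three:

Plain Text
import asyncio
import time


async def fake_tool(city: str) -> str:
    # each call sleeps for 1 second; run concurrently, the sleeps overlap
    await asyncio.sleep(1)
    return f"{city}: 22C"


async def main():
    start = time.time()
    # asyncio.gather schedules all three coroutines at once and waits for them together
    results = await asyncio.gather(
        fake_tool("San Francisco"), fake_tool("Tokyo"), fake_tool("Paris")
    )
    print(results, f"elapsed: {time.time() - start:.1f}s")  # ~1.0s, not ~3.0s


asyncio.run(main())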
17 comments
If you use async entry points (and async tools), you should be ok?

Plain Text
import asyncio

from llama_index.agent.openai import OpenAIAgent
from llama_index.core.tools import FunctionTool


async def my_tool(...) -> str:
    """Some docstring"""
    await asyncio.sleep(1)
    return "Work done"

tool = FunctionTool.from_defaults(async_fn=my_tool)

# could also use FunctionCallingAgent, same thing, more generic
agent = OpenAIAgent.from_tools([tool], ...)

resp = await agent.achat(...)
In my experiment, the tools are not called concurrently. Here is the full reproducible code:
Plain Text
import asyncio
import json
import time

from llama_index.agent.openai import OpenAIAgent
from llama_index.core.tools import FunctionTool
from llama_index.llms.openai import OpenAI

from app.config import OPENAI_API_KEY


async def aget_current_weather(location: str):
    """Get the current weather in a given location"""
    await asyncio.sleep(2)
    # NOTE: this will be printed sequentially
    print(f"sleeping for 2 secs, current time: {time.time()}")
    return json.dumps({"location": location, "temperature": "22", "unit": "celsius"})


async def main():
    aweather_tool = FunctionTool.from_defaults(async_fn=aget_current_weather)
    llm = OpenAI(model="gpt-4o-mini", api_key=OPENAI_API_KEY)
    aagent = OpenAIAgent.from_tools([aweather_tool], llm=llm)
    response = await aagent.achat(
        "What's the weather like in San Francisco, Tokyo, and Paris?",
    )
    print(response)


if __name__ == "__main__":
    asyncio.run(main())
Here is the terminal output:
Plain Text
sleeping for 2 secs, current time: 1734580081.053151
sleeping for 2 secs, current time: 1734580083.055695
sleeping for 2 secs, current time: 1734580085.0581667
The current weather in the three cities is as follows:

- **San Francisco**: 22°C
- **Tokyo**: 22°C
- **Paris**: 22°C

All three cities are experiencing the same temperature!


As you can see, all the printed times differ by 2 secs.
@rice thx for the code, very helpful.

Yeah, I guess OpenAIAgent was never updated to do asyncio.gather on multiple tool calls. If you swap in FunctionCallingAgent, it works as expected:

Plain Text
from llama_index.core.agent import FunctionCallingAgent

...

async def main():
    aweather_tool = FunctionTool.from_defaults(async_fn=aget_current_weather)
    llm = OpenAI(model="gpt-4o-mini", api_key=OPENAI_API_KEY)
    aagent = FunctionCallingAgent.from_tools([aweather_tool], llm=llm)
    response = await aagent.achat(
        "What's the weather like in San Francisco, Tokyo, and Paris?",
    )
    print(response)
Thanks for the code, it works concurrently now. But I have more questions: what does FunctionCallingAgent do differently from OpenAIAgent (other than the former not supporting astream_chat)?
FunctionCallingAgent is more generic. It works with LLMs that implement the FunctionCallingLLM base class (i.e. LLMs that support tool calling in their API).

This way, we don't need a GeminiAgent and an AnthropicAgent; they can all share the same agent, because the tool-calling logic lives in their LLM class.

As for what it does differently, it's basically the exact same logic, but using the code from the FunctionCallingLLM class.
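For example, the same agent class can wrap a different provider simply by swapping the LLM. A hedged sketch, assuming the llama-index-llms-anthropic integration is installed and reusing the aweather_tool from the earlier snippet (the model name is just an example):

Plain Text
from llama_index.core.agent import FunctionCallingAgent
from llama_index.llms.anthropic import Anthropic  # any FunctionCallingLLM works here

# same tool, different provider -- no separate AnthropicAgent class needed
llm = Anthropic(model="claude-3-5-sonnet-20241022")
agent = FunctionCallingAgent.from_tools([aweather_tool], llm=llm)

# inside an async function:
response = await agent.achat("What's the weather like in Tokyo and Paris?")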
I see now. Can you point me to the documentation on FunctionCallingAgent? Well, at this point I really need both concurrent function calling and astream_chat at the same time lol. I might dig deeper into the llamaindex source code later. For now, my options are:
  1. Use OpenAIAgent without concurrent calls
  2. Use FunctionCallingAgent without astream_chat
  3. Use the raw openai client (this one is a bit of a pain to set up; see the sketch below)
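A rough sketch of what option 3 could look like (my own illustration, not from this thread): one round of the raw async OpenAI client, with the parallel tool calls it returns executed concurrently via asyncio.gather. It reuses aget_current_weather from the earlier snippet, assumes every tool call is for that one tool, and omits the follow-up completion that turns the tool outputs into a final answer.

Plain Text
import asyncio
import json

from openai import AsyncOpenAI

client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment

tools = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
}]


async def run_one_round(messages):
    resp = await client.chat.completions.create(
        model="gpt-4o-mini", messages=messages, tools=tools
    )
    tool_calls = resp.choices[0].message.tool_calls or []
    # execute all requested tool calls concurrently
    # (assumes they all target get_current_weather)
    results = await asyncio.gather(
        *[aget_current_weather(**json.loads(tc.function.arguments)) for tc in tool_calls]
    )
    return tool_calls, results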
The methods and interfaces are exactly the same as OpenAIAgent's. But yeah, streaming isn't quite there yet.

It should be an easy PR to add asyncio.gather to the OpenAIAgent tool calls, though.
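The gist of that change would be something like this (a rough sketch, not the actual llama_index internals; acall_tool stands in for whatever helper executes a single tool call):

Plain Text
# sequential: one await per tool call
# tool_outputs = [await acall_tool(tc) for tc in tool_calls]

# concurrent: schedule all tool calls at once and wait for them together
tool_outputs = await asyncio.gather(*(acall_tool(tc) for tc in tool_calls))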
Interesting idea, I'll look around first.
It'll be published in about 5 mins or so automatically
pip install -U llama-index-agent-openai will get the new version
wow nice!! thank you so much
No worries! Hope it helps 💪