Loading multi doc agents from stored indices

Loading multi doc agents from stored indices takes time for every query. Is there a way to pickle an OpenAIAgentWorker?
Tried to pickle `all_tools` with the multi-doc agents v1 notebook and was not able to. Is there a way to save this local object and load it when needed, saving the time of loading every tool at query time?
9 comments
I think since the openai client has a thread lock in it, pickle is not possible

My suggestion would be to cache it in a global var, or use vector dbs/remote storage in your multi-agent setup so that setting it up is closer to a no-op
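The thread-lock point can be checked directly. A minimal sketch, where `FakeClient` is a hypothetical stand-in for the real OpenAI client, not its actual implementation:

```python
import pickle
import threading

class FakeClient:
    """Stand-in for the OpenAI client: holds a thread lock internally."""
    def __init__(self):
        self._lock = threading.Lock()

def is_picklable(obj):
    try:
        pickle.dumps(obj)
        return True
    except TypeError:
        return False
```

Anything that transitively holds a `_thread.lock` (or an open socket) fails the same way, which is why agents built on top of the client can't be pickled either.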
How would I create a global var of an OpenAIAgentWorker?
Depends on what your code looks like

I think in this case actually, you would make the tools global (I think?)

It's just about scope, right?
```python
tools = load_tools()

def chat(msg):
    agent = OpenAIAgent.from_tools(tools)
    return agent.chat(msg)
```
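One way to keep that scope pattern tidy is to lazy-load once per process. A sketch, where `load_tools` is a hypothetical stand-in for the expensive setup step:

```python
from functools import lru_cache

LOAD_COUNT = {"n": 0}  # only here to show the loader runs once

def load_tools():
    # Hypothetical stand-in for the expensive tool-building step.
    LOAD_COUNT["n"] += 1
    return ("vector_tool", "summary_tool")

@lru_cache(maxsize=1)
def get_tools():
    # First call pays the cost; every later call returns the cached tuple.
    return load_tools()

def chat(msg):
    tools = get_tools()
    # agent = OpenAIAgent.from_tools(list(tools))  # cheap to build per request
    return f"answered {msg!r} using {len(tools)} tools"
```

Building a fresh agent per request is fine; it's the tool construction that must only happen once.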
I’m not sure I’m being clear. Per the "Build Document Agent for each Document" section (https://docs.llamaindex.ai/en/stable/examples/agent/multi_document_agents-v1/#build-document-agent-for-each-document), I want this loaded from storage so that when a new query comes in, the user doesn’t have to wait for:

```python
agents_dict, extra_info_dict = await build_agents(docs)
```

I’m trying to have these dictionaries already created before a prompt arrives. Otherwise `build_agents` takes 30 minutes to load the vector and summary tools.
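The summaries in `extra_info_dict` are plain data, so at least that half can be cached to disk with pickle. A sketch under that assumption (the cache path and the fallback contract are made up for illustration):

```python
import os
import pickle

CACHE_PATH = "extra_info_cache.pkl"  # hypothetical cache file

def save_extra_info(extra_info_dict, path=CACHE_PATH):
    # Plain strings/dicts pickle fine; only the agent objects don't.
    with open(path, "wb") as f:
        pickle.dump(extra_info_dict, f)

def load_extra_info(path=CACHE_PATH):
    # Returns the cached dict, or None so the caller can fall back
    # to the slow build_agents(docs) path.
    if not os.path.exists(path):
        return None
    with open(path, "rb") as f:
        return pickle.load(f)
```

The `agents_dict` side still has to be rebuilt at startup, but rebuilding agents over already-persisted indices is much cheaper than re-indexing from scratch.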
Ideally, storing the collation of these tools is what I was wondering about, which is shown here: https://docs.llamaindex.ai/en/stable/examples/agent/multi_document_agents-v1/#build-retriever-enabled-openai-agent

```python
# define tool for each document agent
all_tools = []
for file_base, agent in agents_dict.items():
    summary = extra_info_dict[file_base]["summary"]
    doc_tool = QueryEngineTool(
        query_engine=agent,
        metadata=ToolMetadata(
            name=f"tool_{file_base}",
            description=summary,
        ),
    )
    all_tools.append(doc_tool)
```

all_tools is the array of Tools I want compiled as much as possible before a prompt is received. Given I have the vector and summary tools pickled in storage, how can I “pickle” these QueryEngineTools? Is this possible? If not, how can I get anywhere close to achieving this?
Agents are fantastic, but the tutorials I have found don’t show how to store these QueryEngineTools.
I guess the problem is not storing the tools, as you can do this with pickle, but storing the OpenAIAgents and their associated tools
You can't store tools, especially since they rely on components that are dynamic and/or unpicklable (like the openai client itself)

In your app, it should be possible to set it up so that these tools are cached at runtime, or setup in a way so that it's a no-op to create them (like relying on remotely hosted vector db integrations, etc)
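A sketch of that shape, with a hypothetical `Agent` stand-in (not the real OpenAIAgent): persist only the picklable state, and rebuild the unpicklable wrappers once at startup so individual requests never pay the cost:

```python
import pickle
import threading

class Agent:
    """Hypothetical stand-in for OpenAIAgent: cheap to construct,
    but unpicklable because it holds a lock (like the real client)."""
    def __init__(self, summary):
        self.summary = summary
        self._lock = threading.Lock()

def rebuild_agents(summaries):
    # The slow part (summaries/indices) was cached; wrapping them
    # back into agent objects is fast and done once at boot.
    return {name: Agent(text) for name, text in summaries.items()}

# Persist only the picklable state...
blob = pickle.dumps({"doc1": "covers doc1", "doc2": "covers doc2"})
# ...and rebuild the agents from it when the process starts.
AGENTS = rebuild_agents(pickle.loads(blob))
```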
Thanks for the quick reply. That makes sense now. I’m using Flask on Railway, so maybe I’ll use flask_caching or Redis caching later on.
I don’t know much about the hosted vector db Integrations. Does Weaviate provide caching?
Any hosted vector db (like Weaviate) will be much faster to "load" and index -- because you aren't loading anything into memory, just making an API connection