Handling Huge Data in Llama Index

Hi, I have a FunctionAgent with a few tools. One of them is an MCP server tool that converts Markdown to PDF.
This MCP Server tool returns Image(data=pdf_data, format="pdf").
While generating a report, I get this error:
"""
venv/lib/python3.12/site-packages/llama_index/core/workflow/workflow.py", line 310, in _task
    raise WorkflowRuntimeError(
llama_index.core.workflow.errors.WorkflowRuntimeError: Error in step 'run_agent_step': Timeout on reading data from socket
"""

I assume it's related to the huge pdf_data payload, which I noticed is at least 50k tokens.

What is the proper way to deal with that kind of huge data? Is there something built into the framework that is recommended?
If you look at the full traceback (there are actually two nested tracebacks), you should see where it's timing out (it sounds like it's timing out on the MCP client?)

If that's the case, you should be able to increase the timeout?
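E.g., if you're connecting through BasicMCPClient from llama-index-tools-mcp, it takes a timeout. A rough sketch (the server command is made up, and the exact kwarg name may vary by version):

```python
from llama_index.tools.mcp import BasicMCPClient, McpToolSpec

# Give the MCP server more time to stream back large payloads
# like a 50k-token PDF. `timeout` is in seconds -- check your
# installed llama-index-tools-mcp version for the exact name.
mcp_client = BasicMCPClient(
    "python",
    args=["md_to_pdf_server.py"],  # hypothetical server command
    timeout=120,
)

# Then build tools from it as usual:
# tools = await McpToolSpec(client=mcp_client).to_tool_list_async()
```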
It originated in the bedrock_converse client. I tried increasing the timeout, but it didn't help.
Traceback here
@Logan M
I have a more general question:
Let's say you have a few FunctionAgents running with AgentWorkflow, and one of them is a reporting agent that produces a lot of data.

Should we pass the final content of the report to the other agents, or store it somewhere (the Context or elsewhere) and just let the next agent know that the report has been created?
Depends on how you want to use it, I guess -- do you want all the agents to see the data, or do you want other tools to access the report selectively?
Yes, other tools should access the report selectively.
I'd probably just throw it into the context and access it as needed then 👍 This is fairly easy since tools can take in ctx as an input now
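Something like this (a rough sketch; the tool names and the "report" key are made up):

```python
from llama_index.core.workflow import Context

async def save_report(ctx: Context, report_markdown: str) -> str:
    """Stash the full report in the workflow Context instead of
    returning it, so the big payload never enters the chat history."""
    await ctx.set("report", report_markdown)
    return "Report saved."  # the agent only ever sees this short string

async def get_report(ctx: Context) -> str:
    """Let another agent's tool pull the report on demand."""
    return await ctx.get("report", default="No report saved yet.")
```

When you pass these functions to a FunctionAgent as tools, the ctx argument is injected automatically and kept out of the tool schema the LLM sees.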
In that case, would the LLM be aware of the context passed to the tool? In other words, does the ctx object affect the LLM's context window when the LLM passes it to one of its tools?
The ctx is completely separate from the context window of the LLM. It's not aware of anything in the ctx except for the things in the ctx.get("state") dict
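So anything you do want the agents to see goes into that state dict, which you can seed when building the workflow. A sketch (agent names and the state key are hypothetical):

```python
from llama_index.core.agent.workflow import AgentWorkflow

# Only what's in the "state" dict is formatted into the agents'
# prompts; everything else set on ctx stays invisible to the LLM.
workflow = AgentWorkflow(
    agents=[report_agent, review_agent],  # hypothetical FunctionAgents
    root_agent="report_agent",
    initial_state={"report_done": False},  # made-up key
)
```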
Thanks a lot