Handling Huge Data in Llama Index

Hi, I have a FunctionAgent with a few tools. One of them is an MCP server tool that converts Markdown to PDF.
This MCP Server tool returns Image(data=pdf_data, format="pdf").
While generating a report, I get this error:
"""
venv/lib/python3.12/site-packages/llama_index/core/workflow/workflow.py", line 310, in _task
    raise WorkflowRuntimeError(
llama_index.core.workflow.errors.WorkflowRuntimeError: Error in step 'run_agent_step': Timeout on reading data from socket
"""

I assume it's related to the huge pdf_data payload, which I noticed is at least 50k tokens.

What is the proper way to deal with that kind of huge data? Is there something built into the framework that is recommended?
If you look at the full traceback (there are actually two nested tracebacks), you should see where it's timing out (it sounds like it's timing out on the MCP client?)

If that's the case, you should be able to increase the timeout?
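E.g., if you're connecting through BasicMCPClient from llama-index-tools-mcp, it takes a timeout. A rough sketch (the server command is made up, and the exact kwarg name may vary by version):

```python
from llama_index.tools.mcp import BasicMCPClient, McpToolSpec

# Give the MCP server more time to stream back large payloads
# like a 50k-token PDF. `timeout` is in seconds -- check your
# installed llama-index-tools-mcp version for the exact name.
mcp_client = BasicMCPClient(
    "python",
    args=["md_to_pdf_server.py"],  # hypothetical server command
    timeout=120,
)

# Then build tools from it as usual:
# tools = await McpToolSpec(client=mcp_client).to_tool_list_async()
```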
It originated in the bedrock_converse client. I tried increasing the timeout, but it didn't help.
Traceback here
@Logan M
I have a more general question:
Let's say you have a few FunctionAgents running with AgentWorkflow, and one of them is a reporting agent that produces a lot of data.

Should we pass the final content of the report to the other agents, or store it somewhere (the Context or elsewhere) and just let the next agent know that the report has been created?
Depends on how you want to use it, I guess -- do you want all the agents to see the data, or do you want other tools to access the report selectively?
Yes, other tools should access the report selectively.
I'd probably just throw it into the context and access it as needed then 👍 This is fairly easy since tools can take in ctx as an input now
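Something like this (a rough sketch; the tool names and the "report" key are made up):

```python
from llama_index.core.workflow import Context

async def save_report(ctx: Context, report_markdown: str) -> str:
    """Stash the full report in the workflow Context instead of
    returning it, so the big payload never enters the chat history."""
    await ctx.set("report", report_markdown)
    return "Report saved."  # the agent only ever sees this short string

async def get_report(ctx: Context) -> str:
    """Let another agent's tool pull the report on demand."""
    return await ctx.get("report", default="No report saved yet.")
```

When you pass these functions to a FunctionAgent as tools, the ctx argument is injected automatically and kept out of the tool schema the LLM sees.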
In that case, would the LLM be aware of the context passed to the tool? In other words, does the ctx object affect the LLM's context window when the LLM passes it to one of its tools?
The ctx is completely separate from the context window of the LLM. It's not aware of anything in the ctx except for the things in the ctx.get("state") dict
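So anything you do want the agents to see goes into that state dict, which you can seed when building the workflow. A sketch (agent names and the state key are hypothetical):

```python
from llama_index.core.agent.workflow import AgentWorkflow

# Only what's in the "state" dict is formatted into the agents'
# prompts; everything else set on ctx stays invisible to the LLM.
workflow = AgentWorkflow(
    agents=[report_agent, review_agent],  # hypothetical FunctionAgents
    root_agent="report_agent",
    initial_state={"report_done": False},  # made-up key
)
```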
Thanks a lot