
Updated 2 months ago

Agents

I am building an agent that decomposes queries about internal documentation. The agent works fine, and I am now looking at pushing it to production. All my inputs go through a FastAPI app served by Gunicorn, with Nginx in front as a reverse proxy.
However, I will have quite a few users and anticipate simultaneous queries. What is the best practice for parallelizing agents? Does Gunicorn handle that when I specify the number of workers?
1 comment
Yeah, I think that's the best way. Each thread/worker/request will need its own instance of the agent, though.
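To illustrate the "own instance per request" point, here is a minimal sketch using only the standard library. The `Agent` class, `build_agent`, and `handle_request` are hypothetical stand-ins, not part of any real agent library; the thread pool just simulates concurrent requests. The idea is that the agent is built inside the request handler, so no mutable state (like history) is shared between simultaneous queries.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical stand-in for the real query-decomposing agent."""

    # Mutable per-conversation state: the reason instances should not be shared.
    history: list[str] = field(default_factory=list)

    def run(self, query: str) -> str:
        self.history.append(query)
        return f"answer to {query!r} (history size: {len(self.history)})"


def build_agent() -> Agent:
    """Create a fresh agent. Called once per request; the result is never shared."""
    return Agent()


def handle_request(query: str) -> str:
    """Simulated request handler: builds its own agent, then runs the query."""
    agent = build_agent()
    return agent.run(query)


if __name__ == "__main__":
    # Simulate several simultaneous queries hitting one worker.
    queries = [f"question {i}" for i in range(8)]
    with ThreadPoolExecutor(max_workers=4) as pool:
        for answer in pool.map(handle_request, queries):
            print(answer)
```

In a FastAPI app, the same factory could probably be wired into an endpoint with `Depends(build_agent)` so each request gets a new instance. Process-level parallelism would then come from Gunicorn, for example `gunicorn main:app --workers 4 --worker-class uvicorn.workers.UvicornWorker`, where `main:app` and the worker count are assumptions about your project.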