Hi everyone,

I'm working on a RAG app using FastAPI and llama-index, and I'd like to know if any of you knows a way to handle concurrent requests, given that the chat engine is a shared global variable. Right now I'm using Python's Lock class, but it seems to slow down the process.

I'm open to any suggestions, thanks in advance.
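Roughly the pattern I mean (a simplified sketch, not my exact code — an asyncio.Lock is shown here since the endpoint is async, and the index/engine setup stands in for my real one):

```python
# Simplified sketch of the current setup: one global chat engine shared by all
# requests, guarded by a lock so only one request can use it at a time.
# (asyncio.Lock shown for the async endpoint; the real code may differ.)
import asyncio

from fastapi import FastAPI
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

app = FastAPI()

index = VectorStoreIndex.from_documents(SimpleDirectoryReader("data").load_data())
chat_engine = index.as_chat_engine()  # shared global
lock = asyncio.Lock()

@app.post("/chat")
async def chat(message: str):
    async with lock:  # serializes every request -> this is what slows things down
        response = await chat_engine.achat(message)
    return {"response": str(response)}
```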
5 comments
Don't make the chat engine a shared global? πŸ‘€ (And also, make sure you are using async chat calls as well)
IMO each user should have their own chat history

A shared global chat engine means all requests are sharing the same chat history
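For example, something roughly like this — just a sketch, assuming llama-index 0.10-style imports; the in-memory dict keyed by a client-supplied `session_id` and the `get_chat_engine` helper are illustrative, use whatever session store fits your app:

```python
# Rough sketch: build the (read-only) index once, but give each session its
# own chat engine so chat histories don't mix, and use async chat calls.
# `session_id`, `get_chat_engine`, and the in-memory dict are illustrative.
from fastapi import FastAPI
from pydantic import BaseModel
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

app = FastAPI()

index = VectorStoreIndex.from_documents(SimpleDirectoryReader("data").load_data())
chat_engines = {}  # session_id -> chat engine (and its chat history)

def get_chat_engine(session_id: str):
    if session_id not in chat_engines:
        chat_engines[session_id] = index.as_chat_engine(chat_mode="context")
    return chat_engines[session_id]

class ChatRequest(BaseModel):
    session_id: str
    message: str

@app.post("/chat")
async def chat(req: ChatRequest):
    engine = get_chat_engine(req.session_id)
    response = await engine.achat(req.message)  # async call, no global lock
    return {"response": str(response)}
```

If the same session_id can send requests concurrently you'd still want per-session locking or a queue, but unrelated users no longer block each other.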
Thanks for your suggestion.
You mean the chat engine should be instantiated for each incoming request?
I see, thanks a lot