Streaming issue with simultaneous requests

I was checking this project and found this issue.

When I asked two questions simultaneously, I found that the response was shared between the two requests.
Here I have added a screenshot of that.

Is this a LlamaIndex problem or is there any problem in implementing event streaming in this code?
How can I solve the response event sharing issue between two requests?
Attachment: streaming-issue.png
20 comments
Probably an issue with however this was implemented

It's totally possible to stream multiple requests 🀷
This project is a little old -- this isn't how I would implement streaming these days πŸ‘€
This is more recent from Rohan, using workflows, which is how I would do this

https://github.com/rsrohan99/llamaindex-workflow-streaming-tutorial

If you haven't used workflows yet, they are super cool. Docs here
https://docs.llamaindex.ai/en/stable/module_guides/workflow/
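For reference, a minimal sketch of the workflow streaming pattern those links describe, assuming the current llama-index-core Workflow API; the workflow name, event class, and fields here are illustrative, not taken from Rohan's tutorial:

```python
import asyncio

from llama_index.core.workflow import (
    Context,
    Event,
    StartEvent,
    StopEvent,
    Workflow,
    step,
)


class ProgressEvent(Event):
    msg: str


class EchoFlow(Workflow):
    @step
    async def answer(self, ctx: Context, ev: StartEvent) -> StopEvent:
        # write_event_to_stream pushes events onto this run's own stream,
        # so concurrent runs do not share a global queue.
        ctx.write_event_to_stream(ProgressEvent(msg=f"working on: {ev.query}"))
        return StopEvent(result=f"answered: {ev.query}")


async def main() -> None:
    handler = EchoFlow(timeout=60).run(query="hello")
    async for ev in handler.stream_events():  # only this run's events
        if isinstance(ev, ProgressEvent):
            print(ev.msg)
    print(await handler)  # final StopEvent result


if __name__ == "__main__":
    asyncio.run(main())
```

Because every run() call gets its own handler and stream, two simultaneous requests each see only their own events.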
@Logan M thanks for your feedback.

For now, it would be great if I could solve that issue without workflows. I have tried with this doc, but it didn't work for me because BaseEventHandler is thread-locked.
In that case, do you have any suggestions or doc that I can go through?
Thanks.
The instrumentation event handler would be the safe way to do it -- it's thread- and async-safe
What was the exact issue?
Not sure what the issue with thread locking was πŸ€”
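For reference, a minimal sketch of the instrumentation event handler pattern mentioned above, assuming the llama_index.core.instrumentation API; the handler name is illustrative:

```python
from llama_index.core.instrumentation import get_dispatcher
from llama_index.core.instrumentation.event_handlers import BaseEventHandler
from llama_index.core.instrumentation.events import BaseEvent


class PrintingEventHandler(BaseEventHandler):
    @classmethod
    def class_name(cls) -> str:
        return "PrintingEventHandler"

    def handle(self, event: BaseEvent, **kwargs) -> None:
        # Called for every instrumentation event LlamaIndex fires.
        print("event:", event.class_name())


# Register once, at startup, on the root dispatcher.
root_dispatcher = get_dispatcher()
root_dispatcher.add_event_handler(PrintingEventHandler())
```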
First I used Rohan's code, which did not work for two simultaneous requests because it globally shares the queue data.

Then I changed the code a bit. I now initialize the CustomEventHandler object inside the request function, create a new queue for each user request, and collect the events from that per-request CustomEventHandler object.
But a new issue appeared: the LlamaIndex internal event list never gets cleared, so I keep receiving the old events along with the new ones until the server restarts.
In the attached screenshot you will notice that each event is emitted 5 times, because I asked 5 questions without restarting the server. Asking another question repeats each event 6 times. That is my issue.

I have added the updated code file based on Rohan's implementation.
Attachment: Screenshot_from_2024-10-20_01-40-20.png
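The duplication described above is what you would expect if a new handler is added to the shared root dispatcher on every request and never removed; a rough sketch of that pattern (the names are illustrative, not taken from the attached file):

```python
from llama_index.core.instrumentation import get_dispatcher
from llama_index.core.instrumentation.event_handlers import BaseEventHandler
from llama_index.core.instrumentation.events import BaseEvent

dispatcher = get_dispatcher()


class CustomEventHandler(BaseEventHandler):
    @classmethod
    def class_name(cls) -> str:
        return "CustomEventHandler"

    def handle(self, event: BaseEvent, **kwargs) -> None:
        print("event:", event.class_name())


async def handle_request(question: str) -> None:
    # A fresh handler per request avoids sharing one global queue, but
    # add_event_handler only appends to the dispatcher's handler list and
    # nothing removes the earlier handlers. After 5 requests the dispatcher
    # holds 5 handlers, so every event is delivered 5 times.
    dispatcher.add_event_handler(CustomEventHandler())
    print("handlers registered so far:", len(dispatcher.event_handlers))
```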
Also updating the package to 0.11.14 didn't help.
Hmm. Tbh it's probably going to be 100% less work to use a workflow lol
I might rewrite it later today if I get some free time
@nayan32biswas ^ converted to use workflows
Was a fun little exercise
Thank you @Logan M πŸ₯°
Hi @Logan M, it seems like there is an issue with workflow event streaming when we try to use it with ContextChatEngine.
This is the error:
Attachment: Screenshot_2024-10-20_at_4.31.37_PM.png
That workflow itself IS a context chat engine fyi πŸ˜…

Anyways, yea since you aren't using an LLM directly, the streaming changes slightly

Should be

```python
response = await chat_engine.astream_chat(...)
async for delta in response.async_response_gen():
    ...
```

Where now you are iterating over purely the new chunks/deltas that are being streamed in
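Put together, a minimal sketch of that pattern in a streaming endpoint; FastAPI, the /chat route, the request model, and the pre-built chat_engine are illustrative assumptions, not details from the project:

```python
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from pydantic import BaseModel

app = FastAPI()

# Assumed to be a ContextChatEngine (or the workflow equivalent) built
# elsewhere at startup.
chat_engine = ...


class ChatRequest(BaseModel):
    message: str


@app.post("/chat")
async def chat(req: ChatRequest) -> StreamingResponse:
    # astream_chat returns a streaming chat response object whose
    # async_response_gen() yields only the new text deltas.
    response = await chat_engine.astream_chat(req.message)

    async def gen():
        async for delta in response.async_response_gen():
            yield delta

    return StreamingResponse(gen(), media_type="text/plain")
```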
Thank you @Logan M