async with simple chat engine.

Has anyone here used the async streaming (astream_chat()) APIs?

Currently facing some issues. I have a bot where I dynamically decide to either use the simple chat engine or the condense question chat engine (basically, if the user says thank you or any such greeting, there's no point in running retrieval for that).
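The routing described above can be sketched in plain Python. This is a hypothetical illustration, not the poster's actual code: `is_smalltalk`, `pick_engine`, and the greeting list are all assumptions; the two engine arguments stand in for LlamaIndex's SimpleChatEngine and CondenseQuestionChatEngine.

```python
# Hedged sketch: route small talk to a lightweight engine and real
# questions to the retrieval-backed engine. The keyword check below is
# a placeholder; a real bot might use an LLM or classifier instead.

GREETINGS = {"thanks", "thank you", "hello", "hi", "bye"}

def is_smalltalk(message: str) -> bool:
    """True if the message is a greeting/closing that needs no retrieval."""
    return message.strip().lower().rstrip("!.") in GREETINGS

def pick_engine(message: str, simple_engine, condense_engine):
    """Return the engine to use for this message."""
    return simple_engine if is_smalltalk(message) else condense_engine

print(pick_engine("Thank you!", "simple", "condense"))  # simple
print(pick_engine("How do I reset my password?", "simple", "condense"))  # condense
```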


I have some custom code to handle the streaming. Problem is when I use
Plain Text
response = chat_engine.stream_chat(user_message)


both chat engines work fine but when I switch to
Plain Text
response = await chat_engine.astream_chat(user_message)


I do not get a response from simple chat engine (but the condense question chat engine works)

Trying to figure out what might be going wrong.


If anybody knows of any resources that talk about async streaming, that would be helpful as well.

7 comments
Seems like there is an issue. @Logan M is this something you are aware of?
Does it raise any error? I know astream_chat might cause issues

Normally the usage is something like

Plain Text
response = await agent.astream_chat("hello")

async for token in response.async_response_gen():
    print(token, end="")
ahh interesting. That makes sense. I will try tomorrow and update if this works. Is there any documentation around using async APIs and async + streaming?

Will the above work for both chat engines? (My main question is why my code works for one chat engine and not the other; with the above code change I feel like it will work for simple but may break for condense_question.)

UPDATE: Yep I changed
Plain Text
for message in response.response_gen

to
Plain Text
async for message in response.async_response_gen()

and
Plain Text
try:
  next(response.response_gen)
except StopIteration:

to
Plain Text
try:
    await response.async_response_gen().__anext__()
except StopAsyncIteration:


and now SimpleChatEngine works but the condense question one seems stuck
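The async-iteration pattern from the update above can be exercised against a stand-in async generator, with no chat engine involved. `fake_token_stream` and `drain` are hypothetical names for illustration.

```python
import asyncio

# Minimal sketch of the explicit __anext__ / StopAsyncIteration loop
# shown above, run against a dummy token stream instead of a real
# chat engine response.

async def fake_token_stream():
    for token in ["Hello", ", ", "world"]:
        yield token

async def drain(gen):
    """Collect tokens one at a time, stopping on StopAsyncIteration."""
    tokens = []
    while True:
        try:
            tokens.append(await gen.__anext__())
        except StopAsyncIteration:
            break
    return tokens

print("".join(asyncio.run(drain(fake_token_stream()))))  # Hello, world
```

In practice `async for token in gen:` does the same thing more idiomatically; the explicit form is useful when you only want the first token (e.g. to kick off the stream).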
UPDATE2: Figured it out

Looks like SimpleChatEngine returns an object with achat_stream set, while the condense question engine returns one with chat_stream set. This means the APIs for the chat engines aren't "interchangeable", and if one is mixing multiple chat engines, right now the only way I could find was to use if/else to check which attribute is set and act accordingly
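The if/else workaround described above might look like the following sketch. The attribute names (`achat_stream`, `response_gen`, `async_response_gen`) are taken from the observations in this thread and may differ across llama_index versions; the two dummy response classes are stand-ins invented for illustration.

```python
import asyncio

# Hedged sketch: pick the async or sync token generator depending on
# which stream attribute the response object actually populated.

async def stream_tokens(response):
    if getattr(response, "achat_stream", None) is not None:
        async for token in response.async_response_gen():
            yield token
    else:
        for token in response.response_gen:
            yield token

# Dummy stand-ins for the two response shapes observed above.
class AsyncResp:
    achat_stream = object()  # marker: async path populated
    def async_response_gen(self):
        async def gen():
            for t in ("a", "b"):
                yield t
        return gen()

class SyncResp:
    achat_stream = None
    def __init__(self):
        self.response_gen = iter(("x", "y"))

async def demo():
    return [t async for t in stream_tokens(AsyncResp())]

print(asyncio.run(demo()))  # ['a', 'b']
```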
I don't know if this is the most elegant solution but I guess for me this works for now. Thanks @Logan M and @ravitheja
Hey @Logan M , I have another question regarding async. I have been noticing that if I use Claude, the async request fails in the middle, but with gpt/anyscale it works fine. The error log is

Plain Text
WARNING::llama_index.chat_engine.types::Encountered exception writing response to history: <asyncio.locks.Event object at 0x13de7fe50 [unset]> is bound to a different event loop
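Not specific to this thread's fix, but as general background: "is bound to a different event loop" usually means an asyncio primitive (like the asyncio.Event in the warning) was created or first awaited under one event loop and then used under another. A hedged sketch of the safe pattern, keeping a single entry point and creating primitives inside the running loop:

```python
import asyncio

# General illustration (not LlamaIndex-specific): asyncio.Event latches
# onto the loop it is used in, so creating it in one loop and awaiting
# it in another raises "... is bound to a different event loop".
# Using one asyncio.run() entry point and creating primitives inside
# coroutines avoids the mismatch.

async def worker(done: asyncio.Event):
    await asyncio.sleep(0)  # simulate some async work
    done.set()

async def main():
    done = asyncio.Event()  # created inside the running loop: safe
    await asyncio.gather(worker(done), done.wait())
    return done.is_set()

print(asyncio.run(main()))  # True
```

If a framework creates the Event internally (as in the warning above), the usual culprit is mixing multiple loops, e.g. calling asyncio.run() more than once or combining it with a framework's own loop.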



Also, with anyscale, while the response is completely retrieved, there is sometimes the below stack trace
I have no idea my guy. Would have to look into it at some point