Hi, I'm going crazy with the astream_chat implementation

At a glance

The community member is having issues with the astream_chat implementation in their code: every request fails with "'coroutine' object has no attribute 'async_response_gen'". The community members suggest awaiting the chat_engine.astream_chat(query) call and calling async_response_gen() in the async for loop. The community member then hits a further error, "'async for' requires an object with __aiter__ method, got method", and even once the code runs, the response is not streamed token by token.

The community member eventually finds a workaround: running stream_chat in run_in_threadpool and wrapping its generator in an async_wrap_generator helper, which lets them consume a sync generator as an async generator and stream the response concurrently.

Hi, I'm going crazy with the astream_chat implementation. This is my code:

Plain Text
async def generate_response_qa(
    query: str, chat_session_id: str, type: str, db
) -> AsyncGenerator[str, None]:
    memory, message_history, chat_memory = await get_memory_for_session(chat_session_id, db)

    chat_engine = SimpleChatEngine.from_defaults(memory=chat_memory, llm=QA_LLM_MODEL, prefix_messages=[])
    response = chat_engine.astream_chat(query)

    result = ""
    async for token in response.async_response_gen:
        result += token
        yield token

Each time I run a request I get "Error: 'coroutine' object has no attribute 'async_response_gen'". What am I doing wrong here? BTW I'm using llama-index==0.10.51. Any help is welcome. Thank you.
From the error, you never awaited it:
Plain Text
    response = await chat_engine.astream_chat(query)

    result = ""
    async for token in response.async_response_gen():
        result += token
        yield token 
that should work
Hi, I also tried that, but it gives "Error: 'async for' requires an object with __aiter__ method, got method". 😦
Copy my example
you also missed the brackets
async for token in response.async_response_gen():
Plain Text
async def generate_response_qa(
    query: str, chat_session_id: str, type: str, db
) -> AsyncGenerator[str, None]:
    memory, message_history, chat_memory = await get_memory_for_session(chat_session_id, db)

    chat_engine = SimpleChatEngine.from_defaults(memory=chat_memory, llm=QA_LLM_MODEL, prefix_messages=[])
    response = await chat_engine.astream_chat(query)

    result = ""
    async for token in response.async_response_gen():
        result += token
        yield token
Works, but it's not streaming. At this stage the only solution I see is to build a wrapper around the Generator to get an AsyncGenerator.
I'm not sure what you mean by "it's not streaming"? It works for me
this is an async generator no?
Yes, it is an async generator, and I have Server-Sent Events (SSE). If I use stream_chat, at the endpoint/UI I see the content streamed token by token, but with astream_chat I receive all the content in one go, not token by token. This makes no sense. If it's working for you, then the problem is somewhere in my code after this:
Plain Text
    async for token in response.async_response_gen():
        result += token
        yield token

I understand the await at await chat_engine.astream_chat(query) and the async_response_gen(), but in no example from the LlamaIndex documentation do I see an await before chat_engine.astream_chat. Maybe I'm already too tired.
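For context, the two call shapes differ; a minimal sketch, assuming llama-index 0.10.x, where both methods produce a StreamingAgentChatResponse:

Plain Text
# Sync: stream_chat returns the response object directly,
# and response_gen is a plain generator attribute.
response = chat_engine.stream_chat(query)
for token in response.response_gen:
    ...

# Async: astream_chat is a coroutine, so it must be awaited first,
# and async_response_gen() is a method returning an async generator.
response = await chat_engine.astream_chat(query)
async for token in response.async_response_gen():
    ...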
I think the issue is probably on the receiving side of this function
If I had to guess
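For reference, a hypothetical receiving side; the route, the db handling, and the "qa" type argument are assumptions, not from the thread. If the endpoint does not return a streaming response with an SSE media type, or something in between buffers it, tokens can arrive all at once even though the generator yields them one by one:

Plain Text
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()
db = None  # placeholder for the thread's db handle

@app.get("/chat")  # hypothetical route
async def chat(query: str, chat_session_id: str):
    async def event_stream():
        # generate_response_qa is the function from this thread;
        # "qa" stands in for its type argument.
        async for token in generate_response_qa(query, chat_session_id, "qa", db):
            yield f"data: {token}\n\n"  # SSE framing: one event per token
    return StreamingResponse(event_stream(), media_type="text/event-stream")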
It's a possibility, and maybe an issue with Ollama 0.2.1 and async. As soon as I find the issue I will share it. Thanks.
Hi, I didn't find the issue, but I have a solution for my case, where I'm using FastAPI:

Plain Text
from starlette.concurrency import run_in_threadpool  # also re-exported by fastapi.concurrency

async def generate_response_qa(
    query: str, chat_session_id: str, type: str, db
) -> AsyncGenerator[str, None]:
    chat_memory = await get_memory_for_session(chat_session_id, db)

    chat_engine = SimpleChatEngine.from_defaults(memory=chat_memory, llm=QA_LLM_MODEL, prefix_messages=[])

    # Run the blocking stream_chat call in the threadpool instead of awaiting astream_chat
    response = await run_in_threadpool(chat_engine.stream_chat, query)

    result = ""
    async for token in async_wrap_generator(response.response_gen):
        result += token
        yield token

where:
Plain Text
import asyncio
from typing import Any, AsyncGenerator, Generator

async def async_wrap_generator(sync_gen: Generator[Any, None, None]) -> AsyncGenerator[Any, None]:
    for value in sync_gen:
        await asyncio.sleep(0)  # Yield control to the event loop
        yield value

With this approach I can use a Generator as an async generator, and it runs concurrently thanks to run_in_threadpool.
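One caveat with this design: the wrapper still calls next() on the sync generator from the event loop, so a slow pull can still block it. An alternative sketch, not from the thread, pulls each item in the threadpool as well; _SENTINEL is a local helper, not a library API:

Plain Text
from typing import Any, AsyncGenerator, Generator
from starlette.concurrency import run_in_threadpool

_SENTINEL = object()  # marks exhaustion without raising StopIteration across the await

async def async_wrap_generator(sync_gen: Generator[Any, None, None]) -> AsyncGenerator[Any, None]:
    while True:
        # next() runs in the threadpool, so a blocking pull never stalls the loop
        value = await run_in_threadpool(next, sync_gen, _SENTINEL)
        if value is _SENTINEL:
            break
        yield value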
that's pretty jank lol but works for me
Sometimes desperation leads to desperate deployment. 🙂 Thank you for your time.