Encountered exception writing response

Encountered exception writing response to history: <asyncio.locks.Event object at 0x2b5558fd0 [unset]> is bound to a different event loop

Anyone ever seen this error?
This error ONLY seems to happen when using SummaryIndex
I have no idea how to fix this one lol
You are probably using tree-summarize with the summary index right?
Try using aquery instead of query
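Something like this, if you're querying the index directly (a minimal sketch, assuming the SummaryIndex you've built; the response_mode and query string are just examples):
Plain Text
# Illustrative sketch, not taken from the original code
query_engine = summary_index.as_query_engine(response_mode="tree_summarize")

# sync path would be: query_engine.query("Summarize the documents")
# async path keeps the work on the caller's already-running event loop:
response = await query_engine.aquery("Summarize the documents")
print(response)
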
Plain Text
agent_or_engine = create_summary_index_agent_from_s3_keys(
    model=model,
    user_id=user_id,
    content_s3_key_extension_pairs=content_s3_key_extension_pairs,
    history=history,
)
....
    summary_index = SummaryIndex.from_documents(
        documents, llm=LLM_INSTANCES[model], embed_model=get_embed_model()
    )
    retriever = get_retriever(
        user_id=user_id,
        model=model,
        index=summary_index,
    )
    chat_engine = CondensePlusContextChatEngine.from_defaults(
        retriever=retriever,
        llm=LLM_INSTANCES[model],
        chat_history=history,
        memory=ChatMemoryBuffer.from_defaults(  # pyright: ignore
            chat_history=history,
            token_limit=128000,  # If memory is omitted, the default token limit is small and an error gets thrown.
        ),
    )

Plain Text
response = await agent_or_engine.astream_chat(message)  # pyright: ignore
I don't think I'm using tree-summarize
Maybe instantiating the chat memory buffer is causing an issue
Also, it starts streaming some content but doesn't finish; halfway through, it encounters the issue
Also, this ONLY happens with Claude
uuuuuuu very sus
I really don't know how to debug this πŸ˜… Might need some google fu
I went deeper, it seems to be an issue directly related to StreamingAgentChatResponse
If I just use .chat it works
.stream_chat and .astream_chat BOTH break

I think the astream_chat method of the Anthropic LLM, or the

response = await self._aclient.messages.create(
    messages=anthropic_messages, system=system_prompt, stream=True, **all_kwargs
)

call, is the issue -- whatever it's doing, when the response or generator comes back it causes something ODD
I can't go deep into the Anthropic client, but if you have the time to do so, I think that's where the issue is
If you can reproduce with just llm.astream_chat(ChatMessage(role="user", content="Hello!")) or similar, we can probably open an issue on the anthropic github
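For reference, a standalone repro along those lines might look like this (a sketch; the model name and package imports are assumptions, not from the thread):
Plain Text
import asyncio

from llama_index.core.llms import ChatMessage
from llama_index.llms.anthropic import Anthropic


async def main():
    # model name is just an example
    llm = Anthropic(model="claude-3-opus-20240229")
    stream = await llm.astream_chat([ChatMessage(role="user", content="Hello!")])
    async for chunk in stream:
        print(chunk.delta, end="", flush=True)


asyncio.run(main())

If that fails on its own, the problem likely lives in the Anthropic client; if it streams fine, it points back at how the chat engine drives the stream.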
Alright, I'll look to do that as soon as I can
Seeing this on AzureOpenAI() models, with endpoints behind FastAPI, as well.
I think it's related to the async calls being made in the astream_chat endpoints. It's causing things to happen in another event_loop relative to the FastAPI event_loop. I'm still testing.

https://github.com/run-llama/llama_index/blob/main/llama-index-core/llama_index/core/chat_engine/condense_question.py#L362-L365
Whoops. wrong chat engine link
note the asyncio.run() call made within the Thread
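Roughly the pattern in question, heavily simplified (not the exact llama_index source; the names here are just for illustration):
Plain Text
import asyncio
from threading import Thread

async def write_to_history(memory):
    # awaits asyncio primitives (e.g. an Event) that were created
    # on the original FastAPI event loop
    ...

def start_background_write(memory):
    # asyncio.run() inside a Thread spins up a *new* event loop, so anything
    # bound to the original loop is now being awaited from a different one
    thread = Thread(target=lambda: asyncio.run(write_to_history(memory)))
    thread.start()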
I don't think this is a llamaindex problem, per se, but an integration snafu. I'll try and get some more details over the next couple hours πŸ™‚
right -- I wasn't sure how else to run an async method in a thread like this πŸ˜…
@edhenry Yes, unfortunately I had to switch our whole application over to the sync methods until I have another chance to dive deeper into the async methods and fix them
Some prelim testing, changing this threading call: https://github.com/run-llama/llama_index/blob/main/llama-index-core/llama_index/core/chat_engine/condense_plus_context.py#L355-L359

to something like:

Plain Text
if self._memory:
    # Schedule the history write on the event loop that is already running
    # the request, instead of spawning a Thread + asyncio.run() (which binds
    # the response's asyncio primitives to a second loop)
    asyncio.create_task(
        chat_response.awrite_response_to_history(self._memory)
    )
    chat_response._ensure_async_setup()
    await chat_response._is_function_false_event.wait()


Seems like it might fix it, but I'm still learning about asyncio, myself. Thoughts @Logan M ?
Will that allow you to still stream the response while it's writing to history? Hard to say without trying it, I suppose haha, but I would confirm that
a) streaming still works
b) the chat_history is updated properly when the streaming is complete
if so to both, then it seems like an acceptable fix πŸ’ͺ
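For what it's worth, a quick check for both could look like this (a sketch; chat_engine is the CondensePlusContextChatEngine from earlier, and chat_history is assumed to be its memory-backed property):
Plain Text
response = await chat_engine.astream_chat("Summarize the documents")

# (a) streaming still works: tokens keep arriving until the stream finishes
async for token in response.async_response_gen():
    print(token, end="", flush=True)

# (b) once the stream is consumed, the assistant reply should be in history
print(chat_engine.chat_history[-1])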
a) βœ…
b) βœ…
I'll get a PR raised for this soon πŸ™‚
Awesome! Thanks a ton for debugging this and trying it out -- I'll be trying this out myself when the PR is open 🙂