Hello, I am having trouble porting my code to async

Hello, I am having trouble porting my code to async. I have a chat engine initialized with streaming=True, for which I now call aquery, but this still returns a StreamingResponse whose response_gen attribute is a TokenGen, a synchronous generator. I noticed that types.py also defines a TokenAsyncGen, but I don't see a way to get that from the chat engine. Am I missing something in the library API, or is async streaming of the tokens not implemented yet, so that I have to use a thread to do this asynchronously?
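(For context, a minimal sketch of the setup being described; the index construction and data path are illustrative and not part of the original question.)

```python
# Illustrative sketch only: a streaming query engine built from a simple
# VectorStoreIndex, to show where the synchronous generator appears.
from llama_index import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine(streaming=True)

async def ask(question: str) -> None:
    # aquery itself is awaitable, but the StreamingResponse it returns
    # exposes response_gen as a synchronous generator (TokenGen),
    # so iterating over it blocks the event loop.
    response = await query_engine.aquery(question)
    for token in response.response_gen:
        print(token, end="", flush=True)
```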
Yea, so this is mostly our fault: the streaming responses from indexes do not have asynchronous generator support

However, we are about to change the default chat engine to be our OpenAI function agent, which does have async streaming
But I am using the chat engine with a locally hosted LLM. Or is the OpenAI name just misleading me πŸ˜…
Oh sorry, I just noticed that I kept writing chat engine when I meant query engine. I don't know if this changes the answer
(The chat engine also has support for local LLMs, but then it uses a ReAct agent, which is slightly less reliable.)

Oh yea, if you are only using the query engine, there is no async generator (yet)
Okay, I will solve it with a thread then. The current synchronous streaming blocks my FastAPI endpoint so it can serve only one query at a time, and I was hoping I could solve this with asynchronous streaming.
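(A sketch of that thread-based approach, reusing the streaming query engine from the earlier sketch; the endpoint, queue bridge, and paths are illustrative, not LlamaIndex API.)

```python
# Illustrative sketch: push tokens from the blocking response_gen into an
# asyncio.Queue from a worker thread, so the FastAPI event loop stays free.
import asyncio

from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from llama_index import SimpleDirectoryReader, VectorStoreIndex

index = VectorStoreIndex.from_documents(SimpleDirectoryReader("./data").load_data())
query_engine = index.as_query_engine(streaming=True)

app = FastAPI()

async def token_stream(question: str):
    loop = asyncio.get_running_loop()
    queue: asyncio.Queue = asyncio.Queue()
    done = object()  # sentinel marking the end of the stream

    def produce() -> None:
        # Runs in a worker thread; the blocking iteration never touches
        # the event loop directly.
        response = query_engine.query(question)
        for token in response.response_gen:
            loop.call_soon_threadsafe(queue.put_nowait, token)
        loop.call_soon_threadsafe(queue.put_nowait, done)

    loop.run_in_executor(None, produce)

    while True:
        token = await queue.get()
        if token is done:
            break
        yield token

@app.get("/chat")
async def chat(q: str):
    return StreamingResponse(token_stream(q), media_type="text/plain")
```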
You could also wrap the response gen in an async function? And maybe you'd need to use asyncio? Just guessing haha
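(A sketch of that wrapping idea: a generic helper, hypothetical and not part of LlamaIndex, that turns the synchronous response_gen into an async generator by pulling each token in the default executor.)

```python
# Illustrative helper: adapt a synchronous token generator to an async one.
import asyncio
from typing import AsyncIterator, Iterator

async def as_async_gen(sync_gen: Iterator[str]) -> AsyncIterator[str]:
    loop = asyncio.get_running_loop()
    sentinel = object()
    while True:
        # Each next() call runs in the default thread pool, so waiting for
        # the next token does not block the event loop.
        token = await loop.run_in_executor(None, next, sync_gen, sentinel)
        if token is sentinel:
            break
        yield token

# Usage with the StreamingResponse from the query engine above:
#     response = await query_engine.aquery(question)
#     async for token in as_async_gen(response.response_gen):
#         ...
```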