Hello, I am having trouble porting my code to async

Hello, I am having trouble porting my code to async. I have a chat engine initialized with streaming=True, for which I now call aquery, but this still returns a StreamingResponse whose response_gen attribute is a TokenGen, a synchronous generator. I noticed that types.py also defines a TokenAsyncGen, but I don't see a way to get that from the chat engine. Am I missing something in the library API, or is async streaming of the tokens not implemented yet, so that I have to use a thread to do this asynchronously?
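(For context, a minimal sketch of the setup being described; the index construction and data path are illustrative and not part of the original question.)

```python
# Illustrative sketch only: a streaming query engine built from a simple
# VectorStoreIndex, to show where the synchronous generator appears.
from llama_index import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine(streaming=True)

async def ask(question: str) -> None:
    # aquery itself is awaitable, but the StreamingResponse it returns
    # exposes response_gen as a synchronous generator (TokenGen),
    # so iterating over it blocks the event loop.
    response = await query_engine.aquery(question)
    for token in response.response_gen:
        print(token, end="", flush=True)
```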
Yea, so this is mostly our fault: the streaming responses from indexes do not have asynchronous generator support

However, we are about to change the default chat engine to be our OpenAI function agent, which does have async streaming
But I am using the chat engine with a locally hosted LLM. Or is the OpenAI name just misleading me πŸ˜…
Oh sorry, I just noticed that I kept writing chat engine when I meant query engine. I don't know if this changes the answer
(The chat engine also has support for local LLMs, but then it uses a ReAct agent, which is slightly less reliable.)

Oh yea, if you are only using the query engine, there is no async generator (yet)
Okay, I will solve it with a thread then. The current synchronous streaming blocks my FastAPI endpoint so it can serve only one query at a time, and I was hoping I could solve this with asynchronous streaming.
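(A sketch of that thread-based approach, reusing the streaming query engine from the earlier sketch; the endpoint, queue bridge, and paths are illustrative, not LlamaIndex API.)

```python
# Illustrative sketch: push tokens from the blocking response_gen into an
# asyncio.Queue from a worker thread, so the FastAPI event loop stays free.
import asyncio

from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from llama_index import SimpleDirectoryReader, VectorStoreIndex

index = VectorStoreIndex.from_documents(SimpleDirectoryReader("./data").load_data())
query_engine = index.as_query_engine(streaming=True)

app = FastAPI()

async def token_stream(question: str):
    loop = asyncio.get_running_loop()
    queue: asyncio.Queue = asyncio.Queue()
    done = object()  # sentinel marking the end of the stream

    def produce() -> None:
        # Runs in a worker thread; the blocking iteration never touches
        # the event loop directly.
        response = query_engine.query(question)
        for token in response.response_gen:
            loop.call_soon_threadsafe(queue.put_nowait, token)
        loop.call_soon_threadsafe(queue.put_nowait, done)

    loop.run_in_executor(None, produce)

    while True:
        token = await queue.get()
        if token is done:
            break
        yield token

@app.get("/chat")
async def chat(q: str):
    return StreamingResponse(token_stream(q), media_type="text/plain")
```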
You could also wrap the response gen in an async function? And maybe you'd need to use asyncio? Just guessing haha
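(A sketch of that wrapping idea: a generic helper, hypothetical and not part of LlamaIndex, that turns the synchronous response_gen into an async generator by pulling each token in the default executor.)

```python
# Illustrative helper: adapt a synchronous token generator to an async one.
import asyncio
from typing import AsyncIterator, Iterator

async def as_async_gen(sync_gen: Iterator[str]) -> AsyncIterator[str]:
    loop = asyncio.get_running_loop()
    sentinel = object()
    while True:
        # Each next() call runs in the default thread pool, so waiting for
        # the next token does not block the event loop.
        token = await loop.run_in_executor(None, next, sync_gen, sentinel)
        if token is sentinel:
            break
        yield token

# Usage with the StreamingResponse from the query engine above:
#     response = await query_engine.aquery(question)
#     async for token in as_async_gen(response.response_gen):
#         ...
```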