Retrieving Chunks as ChatResponse Objects from an Agent

Hello there!
I'm running into an issue: I want to retrieve chunks as ChatResponse objects from an agent.
Here is what I did:
Plain Text
response_generator = self.agent.stream_chat(
    message=messages[-1].content, chat_history=messages[:-1]
).chat_stream
for token in response_generator:
    yield token

but i'm getting:
Plain Text
ValueError: generator already executing

When using response_gen instead of chat_stream it works flawlessly. However, I really need those ChatResponse objects.
9 comments
Maybe it would be useful if whole chunks were put on the queue, not just the delta :c
I think the syntax is wrong here?

It should be

Plain Text
resp = agent.stream_chat(...)
for r in resp.response_gen:
  print(r.delta, end="", flush=True)
The thing is, I want to directly access the ChatCompletionChunk objects, not the strings returned from the response_gen property. That's why I tried to read chat_stream directly.
However, I've since noticed that AgentWorker is already starting a thread
Plain Text
            thread = Thread(
                target=agent_response_stream.write_response_to_history,
                args=(task.extra_state["new_memory"],),
                kwargs={"on_stream_end_fn": partial(self.finalize_task, task)},
            )
            thread.start()

in which agent_response_stream.write_response_to_history is already consuming the chat_stream generator.
What's more, it's not even possible to directly retrieve ChatCompletionChunk objects from the queue, as only the delta is put there:
Plain Text
    def write_response_to_history(
        self,
        memory: BaseMemory,
        on_stream_end_fn: Optional[Callable] = None,
    ) -> None:
        if self.chat_stream is None:
            raise ValueError(
                "chat_stream is None. Cannot write to history without chat_stream."
            )

        # try/except to prevent hanging on error
        dispatcher.event(StreamChatStartEvent())
        try:
            final_text = ""
            for chat in self.chat_stream:
                self.is_function = is_function(chat.message)
                if chat.delta:
                    dispatcher.event(
                        StreamChatDeltaReceivedEvent(
                            delta=chat.delta,
                        )
                    )
                    self.put_in_queue(chat.delta)
[...]
@Logan M is there any reason why whole chunks cannot be put on the queue? It might be handy, and at first glance it looks easy to implement.
Tbh, I would use the newer AgentWorkflow -- it exposes much more via its streaming API, and is arguably better engineered overall (eventually these other agent classes will be deprecated anyway).