The community member is building an agent that uses a large language model (LLM) and wants to stream the final output. They mention the option of passing the full final message to a final step and streaming that, but this would result in a latency hit. They haven't found a nice solution yet, as the full message is required to determine if a function call is needed.
Another community member suggests using an async generator in a workflow to first return a boolean to determine if it's a tool call, and then return the stream. They provide a link to a Colab notebook demonstrating this approach.
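The linked notebook is not reproduced here, so the following is only a rough sketch of the general idea, not the notebook's code. It assumes the first streamed chunk already reveals whether the response is a tool call, which may not hold for every provider. `Chunk`, `fake_llm_stream`, and `flag_then_stream` are hypothetical names made up for illustration, and the LLM stream is simulated rather than taken from any real library.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, List, Union


@dataclass
class Chunk:
    """Hypothetical streamed chunk: a text delta or a tool-call fragment."""
    text: str = ""
    tool_call: str = ""


async def fake_llm_stream(chunks: List[Chunk]) -> AsyncIterator[Chunk]:
    """Stand-in for a real streaming LLM call (an assumption, not a library API)."""
    for chunk in chunks:
        await asyncio.sleep(0)  # hand control back to the loop, like a network read
        yield chunk


async def flag_then_stream(
    stream: AsyncIterator[Chunk],
) -> AsyncIterator[Union[bool, str]]:
    """Yield a boolean first (is this a tool call?), then the text deltas."""
    is_tool_call = None
    async for chunk in stream:
        if is_tool_call is None:
            # Assumption: the first chunk is enough to classify the response.
            is_tool_call = bool(chunk.tool_call)
            yield is_tool_call
        if not is_tool_call and chunk.text:
            yield chunk.text
    if is_tool_call is None:
        # Empty stream: report "not a tool call" so the caller always gets a flag.
        yield False


async def collect(chunks: List[Chunk]) -> list:
    """Gather everything the generator yields, for demonstration only."""
    return [item async for item in flag_then_stream(fake_llm_stream(chunks))]


answer = asyncio.run(collect([Chunk(text="Hel"), Chunk(text="lo")]))
tool = asyncio.run(collect([Chunk(tool_call='{"name": "search"}')]))
```

The caller reads the first item to decide whether to route to tool execution or to forward the remaining items to the user as they arrive, so no extra LLM call is needed just to stream the answer.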
The original community member says they are building the agent mostly from scratch using Workflows and had a similar idea, so they will take a look at the notebook. The other community member says the approach in the notebook seemed to work pretty well.
The community members also discuss other potential solutions, such as creating a "Final Answer" tool that requires only a boolean to limit output tokens, and using the new event streaming API from LlamaIndex.
There is no explicitly marked answer, but the community members are collaborating and sharing ideas to find a solution to the original problem.
Hey all, does anyone have an example of building an agent with a function-calling LLM that streams the final output? One option is to pass the full final message to a final step and stream that, but you take a latency hit. I haven't found a nice solution yet, since the full message is needed to determine whether a function call is required.
I was also thinking of creating a "Final Answer" tool that requires only a boolean, to limit output tokens, and then passing the final message on to a final step if that tool is called.
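For what it's worth, the "Final Answer" tool idea could look roughly like the sketch below. The schema follows the common JSON-Schema style used for function calling, but the tool name, the `ready` field, and the step names returned by `next_step` are all hypothetical and not taken from any specific library.

```python
# Hypothetical tool definition: a single boolean argument keeps the
# model's tool-call output to a handful of tokens.
FINAL_ANSWER_TOOL = {
    "name": "final_answer",
    "description": "Call this when you are ready to answer the user.",
    "parameters": {
        "type": "object",
        "properties": {"ready": {"type": "boolean"}},
        "required": ["ready"],
    },
}


def next_step(tool_name: str, tool_args: dict) -> str:
    """Pick the next workflow step based on which tool the model called."""
    if tool_name == FINAL_ANSWER_TOOL["name"] and tool_args.get("ready") is True:
        # Hand off to a separate step that generates and streams the answer.
        return "stream_final_answer"
    # Any other tool call goes through normal tool execution.
    return "run_tool"
```

The trade-off is the one raised above: the final step is a second LLM call, so some latency remains, but the first call only has to emit a tiny tool call rather than the whole answer.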
Just curious, but are you building this as part of where you work? Workflows are new, so I'm always curious about the use cases and business cases people are working on with them.
Yeah, I actually had a discussion with Biswaroop recently about the use cases and was going to ping him again for a follow-up call. I'll mention that you should be included as well, if you're interested.