Oh, I think I might have seen this before. Anthropic has this super awkward failure mode where it can run out of room writing the tool call, and then it just completely removes a bunch of stuff and creates a dummy tool call (this is all directly from Anthropic's API, it's kind of terrible)
Try setting max_tokens to a larger value:
llm = Anthropic(..., max_tokens=1000)
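For reference, a fuller version of that one-liner (the import path and model name are my assumptions based on recent llama-index, adjust for your setup):

# a runnable version of the snippet above; import path and model name
# are assumptions, not taken from this thread
from llama_index.llms.anthropic import Anthropic

llm = Anthropic(
    model="claude-3-opus-20240229",  # example model, swap in whichever you're using
    max_tokens=4096,  # the default is low enough that long tool calls can get truncated
)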
Upped it to 4096 and I still get these kinds of responses:
To do this, we'll use the "****" function, which retrieves the list of Data Source ids and names. Let me call this function for you.
🤔 Oh, maybe not the type of error I was thinking of
How are you using Anthropic? Or what does your code/setup look like?
Does it work when not streaming?
I've been working on a stream_step impl for the FunctionCallingAgentWorker (not at my desk to make sure that's the class name 😅). Anyway, stream_chat_with_tools is seemingly where things are not working as expected. I've simply adapted run_step with a Thread continuing on a ChatStreamResponse...
I can't remember the class names 🤣
Sorry, it works great when not streaming
Using the vanilla FunctionCallingAgentWorker.chat instead of my StreamingFunctionCallingAgentWorker.stream_chat override
tbh tho, I'm still trying to wrap my head around the new(ish) agent stuff
Hmmm, if I had to guess, the model isn't replying with a tool call first thing, so we assume there's no tool call and return a generator
Very tough problem. There's a ticket for this, actually
Ya, reading through the bug... I wonder if the tool calls come through the stream even if not at the beginning? If so, could we proxy the stream and capture tool calls?
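Something like this, maybe? (Total sketch: the tool_calls attribute is made up, the real Anthropic / llama-index chunk objects would need inspecting.)

from typing import Any, Generator, Iterable, List

def tee_tool_calls(stream: Iterable[Any], captured: List[Any]) -> Generator[Any, None, None]:
    # Pass text chunks through to the caller, siphoning anything
    # tool-call-shaped into `captured` as a side effect.
    for chunk in stream:
        if getattr(chunk, "tool_calls", None):  # hypothetical attribute
            captured.append(chunk)
        else:
            yield chunk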
Yea, that's most likely what's happening
And yea, that seems like a decent solution. I think it will just require a lot of refactors to do that properly 😅
Ya, been trying to approximate this solution in the FunctionCallingAgentWorker and it's hacky at best
well, thank you for poking at this with me
I did confirm that tool calls are coming through midway in the stream (at least for Anthropic), if it's of any use
I think to fix this, the entire FunctionCallingAgentWorker.stream_step needs to be a generator, rather than trying to stuff the generator into some background task/thread
The tricky part is we need to write to memory, detect tool calls so we know when to stop the worker, and expose a generator 😅
Could you not iterate over the StreamingChatResponse, yield text responses when they come through, and call tools when they come through, writing to memory as each change in nature (text/tool) occurs? This is of course after you refactor deps to have a generator returned from stream_step 😰 Something like the sketch below.
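Rough shape of what I mean, with toy stand-in types since I don't have the real chunk classes in front of me (just the control flow, not working llama-index code):

from dataclasses import dataclass
from typing import Callable, Dict, Generator, Iterable, List, Tuple, Union

@dataclass
class TextDelta:  # toy stand-in for a streamed text chunk
    text: str

@dataclass
class ToolCallDelta:  # toy stand-in for a streamed tool call
    name: str
    arguments: dict

Chunk = Union[TextDelta, ToolCallDelta]

def stream_step(
    chunks: Iterable[Chunk],
    memory: List[Tuple[str, str]],
    tools: Dict[str, Callable[..., str]],
) -> Generator[str, None, None]:
    # One pass over the stream: yield text deltas to the caller as they
    # arrive, dispatch tool calls inline, and flush to memory at each
    # text/tool boundary.
    buffer: List[str] = []
    for chunk in chunks:
        if isinstance(chunk, ToolCallDelta):
            if buffer:  # text -> tool transition: write accumulated text first
                memory.append(("assistant", "".join(buffer)))
                buffer.clear()
            memory.append(("tool", tools[chunk.name](**chunk.arguments)))
        else:
            buffer.append(chunk.text)
            yield chunk.text  # caller sees text immediately, no background thread
    if buffer:  # flush any trailing text
        memory.append(("assistant", "".join(buffer)))

The real version would need the actual llama-index memory and tool interfaces, but the point is the single generator doing all three jobs.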