Recurring issue with progress event and 429 errors in multi-agent workflow

At a glance

The community member is running a multi-agent workflow in which a service keeps emitting the same progress event. The repeated LLM calls eventually trigger 429 (too many requests) errors from OpenAI, and the community member wants to know how to break out of the endless loop. The replies suggest that an agent keeps requesting a transfer and is probably being handed back to the same agent, that the workflow has no built-in logic to stop such a loop so it has to be added, and that a more capable model than GPT-4o-mini (at least GPT-4o) should be used for the orchestrator agent. The community member suspects an unhandled exception, because pausing the service until the 429s clear makes the problem go away.

Orion Pax

@Logan M for the multi agent workflow, I'm running into a recurring issue where a service returns a progress event over and over no matter what and then it starts returning 429s against OpenAI (meaning too many requests).

Is the retry set to infinite or something? I'm trying to figure out how to get this out of an endless loop.

7 comments

Orion Pax

Here is my 'chat' endpoint in a fast api app:

Plain Text

    try:
        workflow = ConciergeWorkflow(
            timeout=300
        )

        agent_configs = [MathAgentConfig, ManageCaseAgentConfig, SendAutoReplyAgentConfig]

        # we pass it to the workflow
        handler: WorkflowHandler = workflow.run(
            user_msg=request.message.content,
            llm=llm,
            chat_history=[],
            agent_configs=agent_configs,
        )

Orion Pax

Plain Text

        # now we handle events coming back from the workflow
        async for event in handler.stream_events():
            logging.info(f"Got event: {event}")
            # a ToolRequestEvent means an agent wants approval to call a tool;
            # for now every request is approved automatically
            if isinstance(event, ToolRequestEvent):
                # send the approval back to the workflow as a ToolApprovedEvent
                handler.ctx.send_event(
                    ToolApprovedEvent(
                        tool_id=event.tool_id,
                        tool_name=event.tool_name,
                        tool_kwargs=event.tool_kwargs,
                        approved=True
                    )
                )
            elif isinstance(event, InputRequiredEvent):
                logging.info("InputRequiredEvent")
                # we expect the next thing from the socket to be human input, always "Yes" for now
                response = "Yes"

                # which we send back to the workflow as a HumanResponseEvent
                handler.ctx.send_event(
                    HumanResponseEvent(response=response)
                )

            elif isinstance(event, ProgressEvent):
                # the workflow also emits progress events which we send to the frontend
                logging.info("ProgressEvent")
            elif isinstance(event, StopEvent):
                # a StopEvent marks the end of the workflow run
                logging.info("StopEvent")
            else:
                logging.info("Unknown Event")

        # once we've handled all the events, we await the final result
        logging.info("Awaiting final result")
        final_result = await handler
    except Exception as e:
        return JSONResponse(status_code=500, content={"message": str(e)})

    return final_result

Orion Pax

I end up seeing repeated:

Plain Text

2024-11-16T20:17:07.4389005Z 2024-11-16 20:17:07,438 - Got event: msg='Agent is requesting a transfer. Please hold.'
2024-11-16T20:17:07.4397782Z 2024-11-16 20:17:07,438 - ProgressEvent
2024-11-16T20:17:07.8022613Z 2024-11-16 20:17:07,802 - HTTP Request: POST https://azure-openai-xxxxxx.openai.azure.com/openai/deployments/gpt-4o-mini/chat/completions?api-version=2023-03-15-preview "HTTP/1.1 200 OK"

Logan M

Seems like it's constantly requesting a transfer and probably getting transferred back to the same agent that requested the transfer?

Logan M

There's no logic to handle an endless loop like that, you'd have to add it
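One way such a guard could look, as a minimal sketch rather than anything built into the library: count how many times the same progress message arrives in a row and trip once a limit is reached. `RepeatGuard`, `limit`, and `seen` are hypothetical names made up for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RepeatGuard:
    """Hypothetical helper: trips after `limit` identical messages in a row."""

    limit: int = 5
    last_msg: Optional[str] = None
    count: int = 0

    def seen(self, msg: str) -> bool:
        """Record a message; return True once it has repeated `limit` times in a row."""
        if msg == self.last_msg:
            self.count += 1
        else:
            # a different message resets the streak
            self.last_msg = msg
            self.count = 1
        return self.count >= self.limit


# Example: the third identical message in a row trips a guard with limit=3.
guard = RepeatGuard(limit=3)
msgs = ["Agent is requesting a transfer. Please hold."] * 3
tripped = [guard.seen(m) for m in msgs]
```

In the `stream_events()` loop above, the `ProgressEvent` branch could call `guard.seen(event.msg)` (assuming the event exposes its text as `msg`, as the log output suggests) and, when it returns True, stop consuming events and cancel the run in whatever way your version of the library supports.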

Logan M

4o-mini also isn't the smartest for agentic behavior, I'd bet. I'd use at least 4o for the orchestrator

Orion Pax

Weirdly, if I pause the service and let the 429s resolve, the ProgressEvents all disappear and it works fine. So I'm guessing it's stuck in the loop because of some unhandled exception...
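If the 429s themselves are part of what keeps the loop going, one general mitigation is to bound the retries and back off between them. The following is only a sketch of that idea. `RateLimited` and `call_with_backoff` are hypothetical names, and `RateLimited` stands in for whatever exception your LLM client actually raises on a 429.

```python
import asyncio


class RateLimited(Exception):
    """Stand-in for the error your client raises on HTTP 429 (hypothetical)."""


async def call_with_backoff(fn, max_retries=3, base_delay=1.0, sleep=asyncio.sleep):
    """Await fn(); on RateLimited, wait base_delay * 2**attempt and retry.

    Gives up and re-raises after `max_retries` retries, so it can never loop forever.
    """
    for attempt in range(max_retries + 1):
        try:
            return await fn()
        except RateLimited:
            if attempt == max_retries:
                raise  # retries exhausted: surface the error to the caller
            await sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter exists only so the delay can be swapped out in tests. In normal use the default `asyncio.sleep` applies, giving waits of 1, 2, and 4 seconds with the default settings.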
