Hi there! I recently updated llama-index

Hi there! I recently updated llama-index to the current stable version. I use Azure OpenAI models, and in our project, when the moderation models were triggered, we would catch the BadRequestError re-raised from LlamaIndex to identify the flagged category and apply some business logic.
Now, with the update, when the BadRequestError is raised, another exception is raised afterwards:
Plain Text
File ".../python3.10/site-packages/llama_index/core/callbacks/token_counting.py", line 91, in get_llm_token_counts
    raise ValueError(
ValueError: Invalid payload! Need prompt and completion or messages and response.

Is there a way to retrieve the original exception thrown by OpenAI? Has anyone faced a similar issue?
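For reference, our handling looked roughly like this (a minimal sketch: the deployment config, user_prompt, and handle_flagged_content are placeholders, and I'm assuming err.body holds the parsed error object from Azure):
Python
from llama_index.llms.azure_openai import AzureOpenAI
from openai import BadRequestError

# Illustrative setup; deployment name, credentials and API version are placeholders
llm = AzureOpenAI(
    engine="my-deployment",
    model="gpt-4o",
    azure_endpoint="https://my-resource.openai.azure.com/",
    api_key="<key>",
    api_version="2024-02-01",
)

def handle_flagged_content(categories: list[str]) -> None:
    """Placeholder for our business logic on flagged categories."""
    print("Flagged categories:", categories)

user_prompt = "..."  # comes from the end user in our app

try:
    response = llm.complete(user_prompt)
except BadRequestError as err:
    # For content-policy hits, the error body carries the flagged categories
    body = err.body if isinstance(err.body, dict) else {}
    if body.get("code") == "content_filter":
        results = body.get("innererror", {}).get("content_filter_result", {})
        handle_flagged_content(
            [name for name, res in results.items() if res.get("filtered")]
        )
    else:
        raise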
Seems like azure isn't returning the data needed to count tokens

Maybe make sure you pip install -U llama-index-core llama-index-llms-openai llama-index-llms-azure-openai
Exactly! I performed the package updates as you proposed, but the problem persists
Seems like this might be hiding another error

The only way this can happen is if the LLM errors out
remove the token counting handler, and see if you get a different error
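Something like this, assuming it's registered on the global Settings (just a sketch):
Python
from llama_index.core import Settings
from llama_index.core.callbacks import CallbackManager, TokenCountingHandler

# What you likely have now: the token counter registered globally
token_counter = TokenCountingHandler()
Settings.callback_manager = CallbackManager([token_counter])

# For debugging: drop the handler so the underlying LLM error surfaces directly
Settings.callback_manager = CallbackManager([])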
Inspecting get_llm_token_counts, the payload argument holds an Exception (that's ok!)
Plain Text
{<EventPayload.EXCEPTION: 'exception'>: BadRequestError('Error code: 400 - {\'error\': ...}

But the body of the function doesn't handle that case the way it handles EventPayload.PROMPT or EventPayload.MESSAGES, so the else branch raises the ValueError.
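Roughly, the dispatch has this shape (my paraphrase, not the exact library source; dispatch_llm_payload is just a stand-in name):
Python
from llama_index.core.callbacks import EventPayload

def dispatch_llm_payload(payload: dict) -> str:
    # Paraphrased shape of get_llm_token_counts
    if EventPayload.PROMPT in payload:
        return "count prompt/completion tokens"
    if EventPayload.MESSAGES in payload:
        return "count messages/response tokens"
    # A payload holding only EventPayload.EXCEPTION falls through to here
    raise ValueError(
        "Invalid payload! Need prompt and completion or messages and response."
    )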
yea exactly, that's why I mentioned removing the token counter, to see what the error actually was
The actual error is the BadRequestError thrown due to Azure's content policy. Here is the last part of the stack trace after removing the token_counter from Settings.callback_manager:
Plain Text
File ".../python3.10/site-packages/openai/_base_client.py", line 1040, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'error': {'message': "The response was filtered due to the prompt triggering Azure OpenAI's content management policy. Please modify your prompt and retry. To learn more about our content filtering policies please read our documentation: https://go.microsoft.com/fwlink/?linkid=2198766", 'type': None, 'param': 'prompt', 'code': 'content_filter', 'status': 400, 'innererror': {'code': 'ResponsibleAIPolicyViolation', 'content_filter_result': {'hate': {'filtered': True, 'severity': 'high'}, 'jailbreak': {'filtered': False, 'detected': False}, 'self_harm': {'filtered': False, 'severity': 'safe'}, 'sexual': {'filtered': False, 'severity': 'safe'}, 'violence': {'filtered': False, 'severity': 'medium'}}}}}
Plain Text
The response was filtered due to the prompt triggering Azure OpenAI's content management policy. Please modify your prompt and retry. To learn more about our content filtering policies please read our documentation: https://go.microsoft.com/fwlink/?linkid=2198766


Makes sense
Sorry, what do you mean by "makes sense"?
azure gave you a content warning for the prompt you gave it πŸ€·β€β™‚οΈ
thats the error
I think you misunderstood my problem; let me rephrase it.
Our project is a chatbot powered by Azure models that some users rely on in their daily routine. We don't have control over the prompts they send, but per Azure's policies the prompts are checked by these content moderation models.
In our business logic, we intercept this error (BadRequestError, code 400, and all the data it holds) and proceed from there. Everything worked, even the token counter (for some reason!)

Now that we've updated llama-index, this error is no longer raised! Instead, the ValueError is raised, and we have no way to get the data from the original BadRequestError.
Does this make sense to you?
Okay, so I believe this PR introduced this event-streaming error. I think the implementation misses the corresponding logic on the TokenCounter side to take EventPayload.EXCEPTION into account and return 0 for the token counts (prompt, completion, etc.), which seems to have been the default fallback behavior before the PR was merged.
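The behavior I'd expect, sketched at the handler level (just an idea, not a tested patch; the real fix would probably belong inside get_llm_token_counts itself):
Python
from typing import Any, Dict, Optional

from llama_index.core.callbacks import (
    CBEventType,
    EventPayload,
    TokenCountingHandler,
)

class ExceptionTolerantTokenCounter(TokenCountingHandler):
    """Skips counting when an LLM event carries an exception payload."""

    def on_event_end(
        self,
        event_type: CBEventType,
        payload: Optional[Dict[str, Any]] = None,
        event_id: str = "",
        **kwargs: Any,
    ) -> None:
        # Nothing to count for a failed call; bail out so get_llm_token_counts
        # never sees the exception-only payload and raises its ValueError
        if payload and EventPayload.EXCEPTION in payload:
            return
        super().on_event_end(
            event_type, payload=payload, event_id=event_id, **kwargs
        )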
even if the token counter returns zero though, the exception will still be raised from the LLM response
[Attachment: image.png]
it's just one error masking another
I can update the token counter, but that isn't going to solve the original error
Yes, I understand. However, as I mentioned, I'd prefer to get the exception from the stream rather than the one produced in the middle by the TokenCounter callback, which has lower priority as I understand it.
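For now, I'll probably work around it on our side like this (a hack relying on Python's implicit exception chaining, reusing the placeholder names from my first snippet):
Python
from openai import BadRequestError

# llm and user_prompt as in the earlier sketch
try:
    response = llm.complete(user_prompt)
except ValueError as err:
    # The BadRequestError this ValueError was raised while handling is usually
    # attached as the implicit exception context (PEP 3134); re-raise it so our
    # existing content_filter handling still applies
    original = err.__context__ or err.__cause__
    if isinstance(original, BadRequestError):
        raise original
    raise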