Hi there! I recently updated llama-index

Hi there! I recently updated llama-index to the current stable version. I use Azure OpenAI models, and in our project, when the moderation models were triggered, we would catch the BadRequestError re-raised from LlamaIndex to identify the flagged category and apply some business logic.
Now, with the update, when the BadRequestError is raised, another exception is raised afterwards:
Plain Text
File ".../python3.10/site-packages/llama_index/core/callbacks/token_counting.py", line 91, in get_llm_token_counts
    raise ValueError(
ValueError: Invalid payload! Need prompt and completion or messages and response.

Is there a way to retrieve the original exception thrown by OpenAI? Has anyone faced a similar issue?
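For reference, our handling looked roughly like this (a minimal sketch: the deployment config, user_prompt, and handle_flagged_content are placeholders, and I'm assuming err.body holds the parsed error object from Azure):
Python
from llama_index.llms.azure_openai import AzureOpenAI
from openai import BadRequestError

# Illustrative setup; deployment name, credentials and API version are placeholders
llm = AzureOpenAI(
    engine="my-deployment",
    model="gpt-4o",
    azure_endpoint="https://my-resource.openai.azure.com/",
    api_key="<key>",
    api_version="2024-02-01",
)

def handle_flagged_content(categories: list[str]) -> None:
    """Placeholder for our business logic on flagged categories."""
    print("Flagged categories:", categories)

user_prompt = "..."  # comes from the end user in our app

try:
    response = llm.complete(user_prompt)
except BadRequestError as err:
    # For content-policy hits, the error body carries the flagged categories
    body = err.body if isinstance(err.body, dict) else {}
    if body.get("code") == "content_filter":
        results = body.get("innererror", {}).get("content_filter_result", {})
        handle_flagged_content(
            [name for name, res in results.items() if res.get("filtered")]
        )
    else:
        raise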
Seems like azure isn't returning the data needed to count tokens

Maybe make sure you pip install -U llama-index-core llama-index-llms-openai llama-index-llms-azure-openai
Exactly! I performed the package updates as you proposed, but the problem persists
Seems like this might be hiding another error

The only way this can happen is if the LLM errors out
remove the token counting handler, and see if you get a different error
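Something like this, assuming it's registered on the global Settings (just a sketch):
Python
from llama_index.core import Settings
from llama_index.core.callbacks import CallbackManager, TokenCountingHandler

# What you likely have now: the token counter registered globally
token_counter = TokenCountingHandler()
Settings.callback_manager = CallbackManager([token_counter])

# For debugging: drop the handler so the underlying LLM error surfaces directly
Settings.callback_manager = CallbackManager([])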
Inspecting get_llm_token_counts, the payload argument holds an Exception (that's ok!)
Plain Text
{<EventPayload.EXCEPTION: 'exception'>: BadRequestError('Error code: 400 - {\'error\': ...}

But the body of the function doesn't handle that case the way it handles EventPayload.PROMPT or EventPayload.MESSAGES, so the else branch raises the ValueError.
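Roughly, the dispatch has this shape (my paraphrase, not the exact library source; dispatch_llm_payload is just a stand-in name):
Python
from llama_index.core.callbacks import EventPayload

def dispatch_llm_payload(payload: dict) -> str:
    # Paraphrased shape of get_llm_token_counts
    if EventPayload.PROMPT in payload:
        return "count prompt/completion tokens"
    if EventPayload.MESSAGES in payload:
        return "count messages/response tokens"
    # A payload holding only EventPayload.EXCEPTION falls through to here
    raise ValueError(
        "Invalid payload! Need prompt and completion or messages and response."
    )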
yea exactly, that's why I mentioned removing the token counter, to see what the error actually was
The actual error is the BadRequestError thrown due to Azure's content policy. Here is the last part of the stack trace after removing the token_counter from Settings.callback_manager:
Plain Text
File ".../python3.10/site-packages/openai/_base_client.py", line 1040, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'error': {'message': "The response was filtered due to the prompt triggering Azure OpenAI's content management policy. Please modify your prompt and retry. To learn more about our content filtering policies please read our documentation: https://go.microsoft.com/fwlink/?linkid=2198766", 'type': None, 'param': 'prompt', 'code': 'content_filter', 'status': 400, 'innererror': {'code': 'ResponsibleAIPolicyViolation', 'content_filter_result': {'hate': {'filtered': True, 'severity': 'high'}, 'jailbreak': {'filtered': False, 'detected': False}, 'self_harm': {'filtered': False, 'severity': 'safe'}, 'sexual': {'filtered': False, 'severity': 'safe'}, 'violence': {'filtered': False, 'severity': 'medium'}}}}}
Plain Text
The response was filtered due to the prompt triggering Azure OpenAI's content management policy. Please modify your prompt and retry. To learn more about our content filtering policies please read our documentation: https://go.microsoft.com/fwlink/?linkid=2198766


Makes sense
Sorry, what do you mean by "makes sense"?
azure gave you a content warning for the prompt you gave it πŸ€·β€β™‚οΈ
thats the error
I think you misunderstood my problem; let me rephrase it.
Our project is a chatbot powered by Azure models that some users rely on in their daily routine. We don't have control over the prompts they send, but per Azure's policies the prompts are checked by these content moderation models.
In our business logic, we intercept this error (BadRequestError, code 400, and all the data it holds) and proceed from there. Everything worked, even the token counter (for some reason!)

Now that we've updated llama-index, this error is no longer raised! Instead, the ValueError is raised, and we have no way to get the data from the original BadRequestError.
Does this make sense to you?
Okay, so I believe this PR introduced this event-streaming error. I think the implementation misses the corresponding logic on the TokenCounter side to take EventPayload.EXCEPTION into account and return 0 for the token counts (prompt, completion, etc.), which seems to have been the default fallback behavior before the PR was merged.
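The behavior I'd expect, sketched at the handler level (just an idea, not a tested patch; the real fix would probably belong inside get_llm_token_counts itself):
Python
from typing import Any, Dict, Optional

from llama_index.core.callbacks import (
    CBEventType,
    EventPayload,
    TokenCountingHandler,
)

class ExceptionTolerantTokenCounter(TokenCountingHandler):
    """Skips counting when an LLM event carries an exception payload."""

    def on_event_end(
        self,
        event_type: CBEventType,
        payload: Optional[Dict[str, Any]] = None,
        event_id: str = "",
        **kwargs: Any,
    ) -> None:
        # Nothing to count for a failed call; bail out so get_llm_token_counts
        # never sees the exception-only payload and raises its ValueError
        if payload and EventPayload.EXCEPTION in payload:
            return
        super().on_event_end(
            event_type, payload=payload, event_id=event_id, **kwargs
        )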
even if the token counter returns zero though, the exception will still be raised from the LLM response
[Attachment: image.png]
it's just one error masking another
I can update the token counter, but that isn't going to solve the original error
Yes, I understand. However, as I mentioned, I'd prefer to get the exception from the stream rather than the one produced in the middle by the TokenCounter callback, which has lower priority as I understand it.
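For now, I'll probably work around it on our side like this (a hack relying on Python's implicit exception chaining, reusing the placeholder names from my first snippet):
Python
from openai import BadRequestError

# llm and user_prompt as in the earlier sketch
try:
    response = llm.complete(user_prompt)
except ValueError as err:
    # The BadRequestError this ValueError was raised while handling is usually
    # attached as the implicit exception context (PEP 3134); re-raise it so our
    # existing content_filter handling still applies
    original = err.__context__ or err.__cause__
    if isinstance(original, BadRequestError):
        raise original
    raise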