
Token counting

I am trying to figure out how to get an accurate token count when using AWS Bedrock. The callbacks do not seem to match what Bedrock is returning, and I need to capture this accurately for billing reasons. Is there a way to solve this?

Bedrock - {'X-Amzn-Bedrock-Output-Token-Count': '247', 'X-Amzn-Bedrock-Input-Token-Count': '836'}

My Callback -
Embedding Tokens: 10
LLM Prompt Tokens: 1771
LLM Completion Tokens: 484
Total LLM Token Count: 2255
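
For reference, those two headers come straight off the raw boto3 response metadata; a quick sketch of pulling them (the model id and prompt shape are placeholders):

```python
import json

import boto3

client = boto3.client("bedrock-runtime")
resp = client.invoke_model(
    modelId="anthropic.claude-v2",  # placeholder model id
    body=json.dumps(
        {"prompt": "\n\nHuman: hello\n\nAssistant:", "max_tokens_to_sample": 256}
    ),
)

# boto3 lowercases header names under ResponseMetadata
headers = resp["ResponseMetadata"]["HTTPHeaders"]
print(headers["x-amzn-bedrock-input-token-count"])
print(headers["x-amzn-bedrock-output-token-count"])
```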
7 comments
Did you set the callback to use an appropriate tokenizer for your model?
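
A minimal sketch of that setup with the stock TokenCountingHandler; note that the tiktoken encoding here is an assumption, and tiktoken only approximates how Bedrock's Anthropic models tokenize, which by itself can explain a mismatch with the server-side headers:

```python
import tiktoken
from llama_index.callbacks import CallbackManager, TokenCountingHandler

# tiktoken is an approximation for Bedrock models; the exact counts are
# whatever the X-Amzn-Bedrock-*-Token-Count response headers say
token_counter = TokenCountingHandler(
    tokenizer=tiktoken.encoding_for_model("gpt-3.5-turbo").encode,
    verbose=True,  # log counts as events complete
)
callback_manager = CallbackManager([token_counter])
```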

Alternatively, you can write your own callback to grab those counts from the API response yourself (if they are on the API response).
I did use a proper tokenizer (one that matches the embedding model), but they don't match. I see that the LLM init actually takes a CallbackManager. Is there an example of how that works?
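
A minimal sketch of that wiring, assuming the legacy llama_index Bedrock class and a placeholder model id:

```python
from llama_index.callbacks import CallbackManager, TokenCountingHandler
from llama_index.llms import Bedrock

token_counter = TokenCountingHandler()
callback_manager = CallbackManager([token_counter])

# pass the manager at LLM init so LLM events flow through the counter
llm = Bedrock(
    model="anthropic.claude-v2",  # placeholder model id
    callback_manager=callback_manager,
)
# (it can also be set on the ServiceContext so query engines pick it up)
```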
You could copy/modify the existing token counter to calculate your own token counts
https://github.com/run-llama/llama_index/blob/9c30dbe5d2a3868d45da3d29468db583a4986ecb/llama_index/callbacks/token_counting.py#L91

Just have to implement your own BedrockTokenCounter or something, and modify this function
https://github.com/run-llama/llama_index/blob/main/llama_index/callbacks/token_counting.py#L23
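
A rough sketch of that idea. The response object, its raw attribute, and the header names are all assumptions to verify against your llama_index version; in particular, raw may hold only the parsed body, without the boto3 HTTP headers:

```python
from typing import Any, Dict, Optional

from llama_index.callbacks.schema import CBEventType, EventPayload
from llama_index.callbacks.token_counting import (
    TokenCountingEvent,
    TokenCountingHandler,
)


class BedrockTokenCounter(TokenCountingHandler):
    """Hypothetical counter that trusts Bedrock's headers over a tokenizer."""

    def on_event_end(
        self,
        event_type: CBEventType,
        payload: Optional[Dict[str, Any]] = None,
        event_id: str = "",
        **kwargs: Any,
    ) -> None:
        if event_type == CBEventType.LLM and payload is not None:
            response = payload.get(EventPayload.RESPONSE) or payload.get(
                EventPayload.COMPLETION
            )
            # ASSUMPTION: response.raw still carries the boto3 ResponseMetadata;
            # if it only holds the parsed body, the headers are gone by this
            # point and you would need to capture them closer to the client
            raw = getattr(response, "raw", None) or {}
            headers = raw.get("ResponseMetadata", {}).get("HTTPHeaders", {})
            self.llm_token_counts.append(
                TokenCountingEvent(
                    event_id=event_id,
                    prompt="",
                    completion="",
                    prompt_token_count=int(
                        headers.get("x-amzn-bedrock-input-token-count", 0)
                    ),
                    completion_token_count=int(
                        headers.get("x-amzn-bedrock-output-token-count", 0)
                    ),
                )
            )
        else:
            super().on_event_end(event_type, payload, event_id, **kwargs)
```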
Awesome, thank you!
Do you know if the callback has access to the API response?
Not sure what else is in the payload param.
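
One way to find out is a throwaway handler that dumps whatever reaches on_event_end; a debugging sketch, not part of the linked code:

```python
from typing import Any, Dict, List, Optional

from llama_index.callbacks.base import BaseCallbackHandler
from llama_index.callbacks.schema import CBEventType


class PayloadInspector(BaseCallbackHandler):
    """Debugging-only handler: print what each event's payload contains."""

    def __init__(self) -> None:
        super().__init__(event_starts_to_ignore=[], event_ends_to_ignore=[])

    def start_trace(self, trace_id: Optional[str] = None) -> None:
        pass

    def end_trace(
        self,
        trace_id: Optional[str] = None,
        trace_map: Optional[Dict[str, List[str]]] = None,
    ) -> None:
        pass

    def on_event_start(
        self,
        event_type: CBEventType,
        payload: Optional[Dict[str, Any]] = None,
        event_id: str = "",
        **kwargs: Any,
    ) -> str:
        return event_id

    def on_event_end(
        self,
        event_type: CBEventType,
        payload: Optional[Dict[str, Any]] = None,
        event_id: str = "",
        **kwargs: Any,
    ) -> None:
        if payload:
            for key, value in payload.items():
                # for LLM events, the interesting bit is value.raw (if set)
                print(event_type, key, type(value), getattr(value, "raw", None))
```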