`TokenCountingHandler` -- does the `get_llm_token_counts` method it invokes expect a `usage` field in the response from a model? Is it possible to implement something that doesn't rely on this? Here is an example of the `usage` field not being returned by the model server: https://gist.github.com/edhenry/d4ed1c1ddc4734737604a1ab515b527e

`tokenizer` is maybe a misleading name -- it just has to be a callable that, given a string, returns a list. The raw response is available on the `ChatResponse` returned by the `chat` method.

Definitely something to keep in mind, and/or I'll have a poke at modifying it, as I will want to return the raw response with token-counting callbacks but won't have the `usage` field available -- though I suppose I could just add that to my model server API 🤷
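The `tokenizer` contract described above (any callable mapping a string to a list, with the count taken as the list's length) can be sketched without the library at all. `whitespace_tokenizer` and `count_tokens` below are hypothetical names used only for illustration; a real setup would pass something like tiktoken's `encode` instead:

```python
# The "tokenizer" contract: any callable str -> list works, because the
# handler only takes len() of the result. These names are illustrative,
# not LlamaIndex's actual API.

def whitespace_tokenizer(text: str) -> list[str]:
    """Naive stand-in tokenizer: split on whitespace."""
    return text.split()

def count_tokens(text: str, tokenizer=whitespace_tokenizer) -> int:
    """Count tokens locally, with no reliance on a server-side `usage` field."""
    return len(tokenizer(text))

print(count_tokens("Does the response include a usage field?"))  # 7
```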
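One way to square the two approaches is to prefer a server-reported, OpenAI-style `usage` block when it exists and fall back to local tokenization when it doesn't. `llm_token_counts` is a hypothetical helper sketching that idea, not the library's actual implementation; the field names follow the OpenAI chat-completions schema:

```python
# Prefer the server's `usage` block; fall back to counting tokens
# locally when the server omits it. Hypothetical helper for
# illustration only.

def llm_token_counts(raw: dict, prompt: str, completion: str,
                     tokenizer=str.split) -> tuple[int, int]:
    """Return (prompt_tokens, completion_tokens)."""
    usage = raw.get("usage") or {}
    return (
        usage.get("prompt_tokens", len(tokenizer(prompt))),
        usage.get("completion_tokens", len(tokenizer(completion))),
    )

# Server reports usage (OpenAI chat-completions field names):
print(llm_token_counts(
    {"usage": {"prompt_tokens": 12, "completion_tokens": 34}}, "", ""))
# -> (12, 34)

# Server omits usage -> count locally with the fallback tokenizer:
print(llm_token_counts({}, "hello there", "general kenobi"))
# -> (2, 2)
```

The default `tokenizer=str.split` is just a placeholder; anything satisfying the callable-returning-a-list contract slots in.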