The community members are discussing how to calculate token usage for Gemini-based models, such as Gemini-Pro and Gemini-1.5-Flash-Latest. They explore using the tiktoken library for GPT models, but are unsure if it works for Gemini models. Some suggestions include using the AutoTokenizer from Hugging Face, but it's unclear if that's the same tokenizer used by Gemini. The community members also discuss the possibility of using Gemma, but realize it's only for Google open-source models, not Gemini. Ultimately, they find that using Vertex AI's GenerativeModel and a custom tokenizer function may be a solution, but acknowledge it's a "kind of silly" approach.
```python
import tiktoken
from llama_index.core.callbacks import CallbackManager, TokenCountingHandler

token_counter = TokenCountingHandler(
    tokenizer=tiktoken.encoding_for_model("gpt-3.5-turbo").encode
)
```

I use this to calculate token usage for GPT models. How do I calculate token usage for Gemini-based models like gemini-pro or gemini-1.5-flash-latest?
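One way to adapt this, per the "kind of silly" workaround mentioned above: `TokenCountingHandler` only needs a callable that returns a list whose length equals the token count, so you can wrap a Gemini token-counting call in a function that returns a dummy list of that length. The sketch below assumes the `google-generativeai` SDK's `count_tokens` method; the helper name `make_gemini_tokenizer` is hypothetical, and you should verify the SDK call against your installed version.

```python
# Sketch: adapt TokenCountingHandler (which expects a tokenizer that
# returns a list of tokens) to Gemini by wrapping a count-only API.
# The SDK usage in the commented example is an assumption -- check it
# against your version of google-generativeai or Vertex AI.

def make_gemini_tokenizer(count_fn):
    """Wrap a token-count function so it looks like a tokenizer.

    TokenCountingHandler only calls len() on the tokenizer's output,
    so a dummy list of the right length is enough.
    """
    def tokenizer(text: str):
        return [0] * count_fn(text)  # length is all that matters
    return tokenizer

# Hypothetical usage with the google-generativeai SDK:
# import google.generativeai as genai
# from llama_index.core.callbacks import TokenCountingHandler
#
# model = genai.GenerativeModel("gemini-pro")
# token_counter = TokenCountingHandler(
#     tokenizer=make_gemini_tokenizer(
#         lambda text: model.count_tokens(text).total_tokens
#     )
# )
```

This avoids any local tokenizer mismatch (tiktoken, Hugging Face `AutoTokenizer`, or Gemma tokenizers are not guaranteed to match Gemini's), at the cost of an API round-trip per count.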