The community member is using the TokenCountingHandler from the llama_index library to track the token usage of an OpenAI language model. They are wondering whether the same token counting handler can be used with the BAAI/bge-small embedding model instead of OpenAI.
In the comments, another community member suggests changing the tokenizer to one from Hugging Face, and provides an example of how to load the tokenizer for the BAAI/bge-small-en-v1.5 model. However, there is no explicitly marked answer on how to use the token counting handler with the BAAI/bge-small model.
```python
import tiktoken
from llama_index.core.callbacks import CallbackManager, TokenCountingHandler
from llama_index.llms.openai import OpenAI
from llama_index.core import Settings

# Count tokens with the tiktoken encoding for the chosen OpenAI model
token_counter = TokenCountingHandler(
    tokenizer=tiktoken.encoding_for_model("gpt-3.5-turbo").encode
)

Settings.llm = OpenAI(model="gpt-3.5-turbo", temperature=0.2)
Settings.callback_manager = CallbackManager([token_counter])
```

Here, can we use this TokenCountingHandler with the BAAI/bge-small model instead of OpenAI? If so, how?