Generate code to count tokens from a response

At a glance

The post asks the community to generate code that counts the tokens in a response. In the comments, a community member provides a detailed example of using the TokenCountingHandler callback from the LlamaIndex library to count embedding tokens, LLM prompt tokens, LLM completion tokens, and the total LLM token count. The community member also notes that more details can be found in a notebook on GitHub.

@kapa.ai generate code to count of token from response
3 comments
Please stand by @Matthews as I search the knowledge base. I normally take about 20 seconds or so. :wave:
----------------------
To count the tokens from a response in LlamaIndex, you can use the TokenCountingHandler callback. Here's an example of how you can set it up and use it:

Python
import tiktoken
from llama_index.llms import Anthropic
from llama_index import (
    SimpleDirectoryReader,
    VectorStoreIndex,
    ServiceContext,
    set_global_service_context,
)
from llama_index.callbacks import CallbackManager, TokenCountingHandler
import os

os.environ["ANTHROPIC_API_KEY"] = "YOUR_API_KEY"

# Set up the token-counting callback and the service context.
# tiktoken's gpt-3.5-turbo encoding is used here as an approximate
# tokenizer; Anthropic models tokenize differently, so the counts
# for Anthropic are estimates.
token_counter = TokenCountingHandler(
    tokenizer=tiktoken.encoding_for_model("gpt-3.5-turbo").encode
)

callback_manager = CallbackManager([token_counter])

llm = Anthropic()

service_context = ServiceContext.from_defaults(
    llm=llm,
    callback_manager=callback_manager,
    embed_model="local",  # resolves to a local HuggingFace embedding model
)

# Set the global default
set_global_service_context(service_context)
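
Note that the handler counts tokens with whatever tokenizer you pass it. As a quick standalone check (this snippet is illustrative and not part of the original answer), you can call the same tiktoken encoding directly:

Python
# Tokenize a sample string with the same encoding the handler uses
encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")
tokens = encoding.encode("How many tokens is this sentence?")
print(len(tokens))  # the count the handler would record for this text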


After setting up, you can count the tokens from a response as follows:

Python
# Load documents and build an index; "./data" is a placeholder
# for your own document directory
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)

# Reset the counts so the numbers below reflect only this query
token_counter.reset_counts()

# Query
query_engine = index.as_query_engine(similarity_top_k=4)
response = query_engine.query("Your query here")

# Print the counts
print(
    "Embedding Tokens: ",
    token_counter.total_embedding_token_count,
    "\n",
    "LLM Prompt Tokens: ",
    token_counter.prompt_llm_token_count,
    "\n",
    "LLM Completion Tokens: ",
    token_counter.completion_llm_token_count,
    "\n",
    "Total LLM Token Count: ",
    token_counter.total_llm_token_count,
    "\n",
)


Please replace "Your query here" with your actual query, and point SimpleDirectoryReader at your own data directory.

This code will print the count of embedding tokens, LLM prompt tokens, LLM completion tokens, and the total LLM token count.
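
If you need a per-call breakdown rather than running totals, the handler also keeps a list of individual events. The snippet below is a minimal sketch based on the TokenCountingHandler API shown in LlamaIndex's token counting notebook; attribute names may vary between versions:

Python
# Each LLM call is recorded as a TokenCountingEvent
last_event = token_counter.llm_token_counts[-1]

print("Prompt tokens: ", last_event.prompt_token_count)
print("Completion tokens: ", last_event.completion_token_count)
print("Total tokens: ", last_event.total_token_count)
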
For more details, you can refer to the Token Counting Handler notebook on GitHub.
----------------------
Note: you can ask me a follow-up question by @-mentioning me again :speech_balloon:
----------------------