TokenCounter doesn't count tokens.

At some point, it stopped working. Here is the code:
Plain Text
token_counter = TokenCountingHandler(
    tokenizer=tiktoken.encoding_for_model(model_name).encode,
    verbose=False,
)
callback_manager = CallbackManager([token_counter])
service_context = ServiceContext.from_defaults(
    llm_predictor=llm_predictor,
    chunk_size=project_chunk_size,
    callback_manager=callback_manager,
)
index = VectorStoreIndex.from_vector_store(vector_store, service_context=service_context)
retriever = index.as_retriever(verbose=True, chat_mode="context", similarity_top_k=similarity_top_k)
custom_chat_engine = CustomContext.from_defaults(
    retriever=retriever,
    memory=chatmemory,
    context_template=generate_context_template(),
    system_prompt=prepared_system_prompt,
    node_postprocessors=[
        CustomPostprocessor(
            context_limit, query_text + prepared_system_prompt, project.db_name, None
        )
    ],
)
response = custom_chat_engine.chat(query_text, chat_history=chat_history)
tokens_used = token_counter.total_llm_token_count  # <----- ALWAYS ZERO

Thanks!
seems kinda sus

Seems like CustomContext is your own class? Is it using the service context?
Yeah, it inherits from ContextChatEngine and overrides just one method. Could that be the reason?
Plain Text
from typing import List

from llama_index.chat_engine import ContextChatEngine
from llama_index.llms import ChatMessage, MessageRole


class CustomContext(ContextChatEngine):
    def _get_prefix_messages_with_context(self, context_str: str) -> List[ChatMessage]:
        """Get the prefix messages with context."""
        # ensure we grab the user-configured system prompt
        system_prompt = ""
        prefix_messages = self._prefix_messages
        if (
            len(self._prefix_messages) != 0
            and self._prefix_messages[0].role == MessageRole.SYSTEM
        ):
            system_prompt = str(self._prefix_messages[0].content)
            prefix_messages = self._prefix_messages[1:]

        # system prompt first, then context (opposite order vs. the base class)
        context_str_w_sys_prompt = system_prompt.strip() + context_str
        return [
            ChatMessage(content=context_str_w_sys_prompt, role=MessageRole.SYSTEM),
            *prefix_messages,
        ]
ah I see, for the chat engine then, pass the LLM into it

Plain Text
custom_chat_engine = CustomContext.from_defaults(llm=service_context.llm, ...)
thanks, let me try and see if it helps!
No, it didn't help. I see that in the ContextChatEngine class, the LLM is already obtained from the service context (context.py, line 75):
Plain Text
llm = service_context.llm_predictor.llm
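(A plausible reason passing the LLM alone isn't enough, sketched loosely from the legacy llama_index source rather than quoted from it: when no service_context is given, from_defaults falls back to a fresh default one, and callback events are dispatched through that default CallbackManager, which has no TokenCountingHandler attached.)
Plain Text
# Paraphrased sketch of the fallback inside ContextChatEngine.from_defaults
# (legacy llama_index). Names match the library; control flow is approximate.
service_context = service_context or ServiceContext.from_defaults()
llm = service_context.llm_predictor.llm
# Token events go through this default manager, never reaching token_counter:
callback_manager = service_context.callback_manager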
Hmm. I tried my own code and it works fine

Plain Text
from llama_index.callbacks import CallbackManager, TokenCountingHandler
from llama_index.chat_engine import ContextChatEngine
from llama_index.llms import OpenAI
from llama_index import Document, ServiceContext, VectorStoreIndex
import tiktoken

token_counter = TokenCountingHandler(
    tokenizer=tiktoken.encoding_for_model("gpt-3.5-turbo").encode, verbose=False
)
callback_manager = CallbackManager([token_counter])
service_context = ServiceContext.from_defaults(
    llm=OpenAI(model="gpt-3.5-turbo"), chunk_size=512, callback_manager=callback_manager
)

index = VectorStoreIndex.from_documents(
    [Document.example()], service_context=service_context
)

chat_engine = index.as_chat_engine(
    verbose=True, chat_mode="context", similarity_top_k=2
)

response = chat_engine.chat("Tell me something about LLMs")

print(token_counter.total_llm_token_count)
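(As a side note: TokenCountingHandler in these legacy versions also exposes per-category counts and a reset, which is handy when counting per request. A minimal sketch, assuming the same token_counter as above:)
Plain Text
print(token_counter.prompt_llm_token_count)       # tokens sent as prompts
print(token_counter.completion_llm_token_count)   # tokens in completions
print(token_counter.total_embedding_token_count)  # embedding tokens, if any
token_counter.reset_counts()  # start from zero before the next query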
What should I check to figure this out? Maybe debug inside and see how the tokens are counted? But I'm not sure where to look...
add the service context here

Plain Text
custom_chat_engine = CustomContext.from_defaults(
    retriever=retriever,
    memory=chatmemory,
    context_template=generate_context_template(),
    system_prompt=prepared_system_prompt,
    service_context=service_context,
    node_postprocessors=[
        CustomPostprocessor(
            context_limit, query_text + prepared_system_prompt, project.db_name, None
        )
    ],
)
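(Why this likely works: with service_context passed explicitly, from_defaults reuses its CallbackManager instead of building a default one, so the TokenCountingHandler finally receives the LLM events. A quick check, reusing the names from the original snippet:)
Plain Text
response = custom_chat_engine.chat(query_text, chat_history=chat_history)
print(token_counter.total_llm_token_count)  # now non-zero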
It worked, yay! 🙂