Updated 2 months ago

Hi, I am trying to use TokenCountingHandler with .as_chat_engine(...).astream_chat, but the counter reports 0 tokens. Has anyone faced (and hopefully solved) this issue?

Token Counter:
token_counter = TokenCountingHandler( tokenizer=tiktoken.encoding_for_model('gpt-4').encode, verbose=False )
LLM:
AzureOpenAI(..., callback_manager=CallbackManager([token_counter]))
5 comments
At first glance, the model argument for encoding_for_model takes a model object, not a string.

instead of "gpt-4", do OpenAI(model="gpt-4")

needs to be imported like so:
from llama_index.llms.openai import OpenAI
I don't think that's true? tiktoken is an external library; it doesn't accept llama-index LLMs.

Probably in your code, you are not passing in or setting up the LLM properly after attaching the callback manager?
Yes, tiktoken takes a model name, not a llama-index LLM.

If you are referring to this as setting up:
Settings.llm = OpenAI(model="gpt-3.5-turbo", temperature=0.2)
Settings.callback_manager = CallbackManager([token_counter])

Since I am passing the callback manager directly to the LLM, I didn't think it would be necessary to set it up in the settings. In fact, I am not using "from llama_index.core import Settings" at all.

I am loading the index with VectorStoreIndex.from_vector_store(...)
and calling as_chat_engine() on it in condense plus context mode.

Finally, I am calling astream_chat on the chat engine returned by index.as_chat_engine.
Hmm, what if you set

Settings.llm = OpenAI(model="gpt-3.5-turbo", temperature=0.2, callback_manager=CallbackManager([token_counter]))
sorry, my bad, i misread my own code