Hi, I am trying to use TokenCountingHandler with .as_chat_engine(...).astream_chat, but I am getting 0 tokens as the outcome. I wonder if anyone has ever faced (and hopefully solved) this issue.
Yes. Note that tiktoken takes the model name (a string like "gpt-3.5-turbo"), not the llama-index LLM object.
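For reference, a minimal sketch of how the counter is usually built, assuming the gpt-3.5-turbo model; the handler takes a tokenizer function, and tiktoken.encoding_for_model expects the model name string:

```python
import tiktoken
from llama_index.core.callbacks import CallbackManager, TokenCountingHandler

# tiktoken is keyed by the model *name*, not by a llama-index LLM instance
token_counter = TokenCountingHandler(
    tokenizer=tiktoken.encoding_for_model("gpt-3.5-turbo").encode
)
callback_manager = CallbackManager([token_counter])
```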
If you are referring to setting it up like this: Settings.llm = OpenAI(model="gpt-3.5-turbo", temperature=0.2) and Settings.callback_manager = CallbackManager([token_counter])
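Spelled out with imports, that global-Settings wiring would look roughly like this (a sketch, assuming the OpenAI LLM from llama-index-llms-openai and the token_counter built above):

```python
from llama_index.core import Settings
from llama_index.core.callbacks import CallbackManager
from llama_index.llms.openai import OpenAI

# register both the LLM and the callback manager globally, so components
# built afterwards (including chat engines) report token usage to the handler
Settings.llm = OpenAI(model="gpt-3.5-turbo", temperature=0.2)
Settings.callback_manager = CallbackManager([token_counter])
```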
Since I am passing the callback manager directly to the LLM, I didn't think it would be necessary to set it up in Settings. In fact, I am not using "from llama_index.core import Settings" at all.
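For context, this is a sketch of what I mean, assuming the callback_manager keyword on the LLM constructor:

```python
from llama_index.core.callbacks import CallbackManager
from llama_index.llms.openai import OpenAI

# callback manager attached to the LLM only, not to global Settings
llm = OpenAI(
    model="gpt-3.5-turbo",
    temperature=0.2,
    callback_manager=CallbackManager([token_counter]),
)
```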
I am loading the index with VectorStoreIndex.from_vector_store(...) and using it via as_chat_engine() in condense-plus-context mode.
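Roughly like this (a sketch; the vector_store variable and the llm from above are assumptions on my side):

```python
from llama_index.core import VectorStoreIndex

# rebuild the index on top of an existing vector store (no re-ingestion)
index = VectorStoreIndex.from_vector_store(vector_store)

# condense-plus-context chat engine over that index
chat_engine = index.as_chat_engine(
    chat_mode="condense_plus_context",
    llm=llm,
)
```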
Finally, I am calling astream_chat on the chat engine returned by index.as_chat_engine(...).
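And then consuming the stream and checking the counter afterwards, roughly like this (a sketch; the query string is made up, and the counts are read from the attributes TokenCountingHandler exposes):

```python
# stream the answer, then inspect the counter
response = await chat_engine.astream_chat("What does the document say about X?")
async for token in response.async_response_gen():
    print(token, end="", flush=True)

print(token_counter.prompt_llm_token_count)
print(token_counter.completion_llm_token_count)
print(token_counter.total_llm_token_count)
```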