token_counter = TokenCountingHandler(
    tokenizer=tiktoken.encoding_for_model("gpt-3.5-turbo").encode
)
callback_manager = CallbackManager([token_counter])
Settings.callback_manager = callback_manager
llm = OpenAI(temperature=0.0, model='gpt-3.5-turbo-0125', max_tokens=4000)
slides_query_engine = RetrieverQueryEngine.from_args(
    retriever=slides_hybrid_retriever,
    node_postprocessors=[cohere_rerank],
    llm=llm,
    # callback_manager=callback_manager,
    embed_model=embed_model,
)
Settings.llm = llm
Try passing the callback manager into the LLM constructor directly, and not passing it into the query engine:

llm = OpenAI(...., callback_manager=callback_manager)
Otherwise, the BaseLLM class would need to pull the callback manager from the global Settings by default for token counting to work in this case.
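To illustrate why the counter stays at zero unless the manager reaches the LLM object, here is a toy sketch of the callback pattern. These classes are simplified stand-ins written for this example, not the real LlamaIndex `TokenCountingHandler`, `CallbackManager`, or `OpenAI` classes; the whitespace tokenizer is likewise a placeholder for tiktoken.

```python
class TokenCountingHandler:
    """Toy handler: counts tokens for every LLM call it is told about."""
    def __init__(self, tokenizer):
        self.tokenizer = tokenizer
        self.total_llm_token_count = 0

    def on_llm_call(self, prompt, completion):
        self.total_llm_token_count += len(self.tokenizer(prompt))
        self.total_llm_token_count += len(self.tokenizer(completion))


class CallbackManager:
    """Toy manager: fans an event out to its handlers."""
    def __init__(self, handlers):
        self.handlers = handlers

    def dispatch_llm_call(self, prompt, completion):
        for handler in self.handlers:
            handler.on_llm_call(prompt, completion)


class FakeLLM:
    """Only fires events on the manager it was constructed with.

    A manager that lives only on a global Settings object is never
    consulted here, which is the failure mode discussed above.
    """
    def __init__(self, callback_manager=None):
        self.callback_manager = callback_manager

    def complete(self, prompt):
        completion = "ok"
        if self.callback_manager is not None:
            self.callback_manager.dispatch_llm_call(prompt, completion)
        return completion


token_counter = TokenCountingHandler(tokenizer=str.split)
manager = CallbackManager([token_counter])

llm_without = FakeLLM()                       # no manager: calls go uncounted
llm_with = FakeLLM(callback_manager=manager)  # manager attached: calls counted

llm_without.complete("hello world")
llm_with.complete("hello world")
print(token_counter.total_llm_token_count)    # 3: "hello", "world", "ok"
```

The design point is the same as in the thread: the handler only sees calls routed through a manager the LLM actually holds a reference to, so the manager must be passed at LLM construction (or the base class must fall back to the global one).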