Hi all, I have a question. Why doesn't token counting work when I pass service_context to CondensePlusContextChatEngine? When I run the following, I always get 0:
response = chat_engine.chat(input_text)
print(str(token_counter.total_llm_token_count))
Also, if I count tokens this way, will the count be affected when many requests come in at the same time? When I use service_context with RetrieverQueryEngine.from_args(retriever, text_qa_template=prompt_tmpl, service_context=service_context), I get reasonable results, but with CondensePlusContextChatEngine I don't.
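To make the concurrency concern concrete, here is a minimal plain-Python sketch. It does not use LlamaIndex at all: `FakeTokenCounter`, `fake_chat`, and the two `handle_*` functions are hypothetical stand-ins, and the "one word = one token" rule is only an assumption for illustration. It shows why a counter shared across simultaneous requests reports mixed totals, while a counter created per request does not.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class FakeTokenCounter:
    """Hypothetical stand-in for a token-counting handler (not a LlamaIndex class)."""

    def __init__(self):
        self.total_llm_token_count = 0
        self._lock = threading.Lock()

    def add(self, n_tokens):
        # The lock keeps the running total consistent across threads.
        with self._lock:
            self.total_llm_token_count += n_tokens


def fake_chat(text, counter):
    # Assumption for illustration: each whitespace-separated word costs one token.
    counter.add(len(text.split()))


# One counter shared by every request, like a module-level token_counter.
shared_counter = FakeTokenCounter()


def handle_with_shared_counter(text):
    fake_chat(text, shared_counter)
    # This value can include tokens from other requests running at the same time.
    return shared_counter.total_llm_token_count


def handle_with_own_counter(text):
    # A fresh counter per request keeps each total isolated.
    counter = FakeTokenCounter()
    fake_chat(text, counter)
    return counter.total_llm_token_count


requests = ["one two three", "four five", "six"]

with ThreadPoolExecutor(max_workers=3) as pool:
    shared_totals = list(pool.map(handle_with_shared_counter, requests))
    own_totals = list(pool.map(handle_with_own_counter, requests))

print(own_totals)                            # [3, 2, 1]
print(shared_counter.total_llm_token_count)  # 6, the sum over all requests
```

With the shared counter, the per-request values in `shared_totals` depend on thread timing, so they cannot be attributed to a single request. Whether the same applies to a real token counter depends on how it is attached to the engine, which this sketch does not model.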
@WhiteFang_Jr I've tried the method you mentioned. It works well if I run it from the query function. However, if I call the query function multiple times at once, it throws an error.