Has anyone ever used OpenInferenceCallbackHandler with the condense_plus_context chat engine? In the log, the response text is the output of my DEFAULT_CONDENSE_PROMPT. However, in the chat menu, I see the response to the system prompt.
Hi, I am trying to use TokenCountingHandler with .as_chat_engine(...).astream_chat. I am getting 0 tokens as the outcome. I wonder if anyone has ever faced (and hopefully solved) this issue.