Chat Engine missing Callbacks

I'm working on building a callback handler to integrate Langfuse (LLM observability) with LlamaIndex (see https://github.com/langfuse/langfuse/issues/188), but I'm running into missing callback events from a chat engine.

I have a basic LlamaIndex chat engine (chat_mode is context) which queries Pinecone and tries to answer questions based on that context. However, I notice that the chat engine isn't emitting certain callbacks (specifically the retrieval-related ones). I hooked up the debug handler and pasted the output below. I know for a fact (through print statements) that Pinecone is getting hit when chatting with the chat engine.
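
For reference, the setup looks roughly like this (a sketch, not my exact code: the Pinecone keys, index name, and the example question are placeholders, and the pinecone-client 2.x calls are what llama-index 0.9.x expects):
Python
import pinecone
from llama_index import ServiceContext, VectorStoreIndex
from llama_index.callbacks import CallbackManager, LlamaDebugHandler
from llama_index.vector_stores import PineconeVectorStore

# The debug handler prints the "Trace: ..." blocks shown below when each trace ends
llama_debug = LlamaDebugHandler(print_trace_on_end=True)
service_context = ServiceContext.from_defaults(
    callback_manager=CallbackManager([llama_debug])
)

# Placeholder Pinecone wiring
pinecone.init(api_key="YOUR_API_KEY", environment="YOUR_ENVIRONMENT")
vector_store = PineconeVectorStore(pinecone_index=pinecone.Index("your-index-name"))
index = VectorStoreIndex.from_vector_store(vector_store, service_context=service_context)

# Query engine: the trace shows QUERY / RETRIEVE / EMBEDDING / SYNTHESIZE / LLM events
query_engine = index.as_query_engine()
print(query_engine.query("What does the context say about X?"))

# Context chat engine: only the LLM event shows up in the trace
chat_engine = index.as_chat_engine(chat_mode="context")
print(chat_engine.chat("What does the context say about X?"))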

query callbacks:
Plain Text
**********
Trace: query
    |_CBEventType.QUERY ->  3.858602 seconds
      |_CBEventType.RETRIEVE ->  1.956498 seconds
        |_CBEventType.EMBEDDING ->  0.702143 seconds
      |_CBEventType.SYNTHESIZE ->  1.901565 seconds
        |_CBEventType.TEMPLATING ->  5e-05 seconds
        |_CBEventType.LLM ->  1.889486 seconds
**********

versus
chat callbacks:
Plain Text
**********
Trace: chat
     |_CBEventType.LLM ->  7.187319 seconds
**********


Anyone know why this might be happening?
5 comments
@Logan M is this expected behavior for LlamaIndex chat engines? (I'm using v0.9.14). Are there any plans on the roadmap to flesh out the chat callbacks?
There are plans to expand callbacks, just haven't gotten around to it yet.

Should be relatively easy in this case, just need to add some trace decorators to the chat endpoints -- feel free to open a PR, otherwise I'll get to it at some point
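Something along these lines (a rough sketch: it assumes the chat engine keeps a callback_manager attribute the way query engines do, and the trace_method name is just illustrative):
Python
from functools import wraps

def trace_method(trace_id: str):
    """Wrap a chat/query entry point in a callback trace."""
    def decorator(func):
        @wraps(func)
        def wrapper(self, *args, **kwargs):
            # CallbackManager.as_trace opens/closes a trace around the call, so
            # nested events (retrieve, embedding, LLM, ...) get attached to it
            # and show up in handlers like LlamaDebugHandler.
            with self.callback_manager.as_trace(trace_id):
                return func(self, *args, **kwargs)
        return wrapper
    return decorator

# Applied to a chat engine it would look something like:
# class ContextChatEngine(BaseChatEngine):
#     @trace_method("chat")
#     def chat(self, message, chat_history=None):
#         ...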
got it, thanks for the reply 🙂
At the moment, I've got something working with just the LLM callback that's good enough (since it has most of the info I need), but I'll poke around the callbacks part of the codebase to see if it's straightforward enough for me to draft a PR!
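For reference, a rough sketch of that LLM-only workaround (a print stub stands in for the actual Langfuse calls, so treat it as illustrative rather than the real integration):
Python
from typing import Dict, List, Optional

from llama_index.callbacks.base import BaseCallbackHandler
from llama_index.callbacks.schema import CBEventType, EventPayload

class LLMOnlyHandler(BaseCallbackHandler):
    """Captures only LLM events; replace print() with calls into Langfuse."""

    def __init__(self) -> None:
        super().__init__(event_starts_to_ignore=[], event_ends_to_ignore=[])

    def start_trace(self, trace_id: Optional[str] = None) -> None:
        pass

    def end_trace(
        self,
        trace_id: Optional[str] = None,
        trace_map: Optional[Dict[str, List[str]]] = None,
    ) -> None:
        pass

    def on_event_start(self, event_type, payload=None, event_id="", parent_id="", **kwargs) -> str:
        # Nothing to do on start; the LLM end event carries both prompt and response payloads
        return event_id

    def on_event_end(self, event_type, payload=None, event_id="", **kwargs) -> None:
        if event_type == CBEventType.LLM and payload is not None:
            prompt = payload.get(EventPayload.MESSAGES) or payload.get(EventPayload.PROMPT)
            response = payload.get(EventPayload.RESPONSE) or payload.get(EventPayload.COMPLETION)
            print("LLM call:", prompt, "->", response)  # forward to Langfuse here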