It's frustrating because the response payload is literally

Plain Text
  "usage": {
    "completion_tokens": 9,
    "prompt_tokens": 35,
    "total_tokens": 44
  }


Which has all the info I need, but the LlamaIndex abstraction makes it harder 😂
A curse of needing to support many LLMs -- openai is the only one providing these counts really
Already figured it out
Subclassed and overridden
Lol that would be my first instinct too, nice

If you wanted to ride within the rules of the library, there's probably a way to give each chat its own callback handler per user (something like the sketch below), so you'd have the token counts for each user request
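(For reference, a rough sketch of that per-user callback idea, not from the thread; it assumes the legacy llama_index ServiceContext / TokenCountingHandler APIs, and `index` is a placeholder.)
Plain Text
import tiktoken
from llama_index import ServiceContext
from llama_index.callbacks import CallbackManager, TokenCountingHandler

# One TokenCountingHandler per user/session so counts don't mix between callers
token_counter = TokenCountingHandler(
    tokenizer=tiktoken.encoding_for_model("gpt-3.5-turbo").encode
)
service_context = ServiceContext.from_defaults(
    callback_manager=CallbackManager([token_counter])
)

# Assuming `index` (and its engines) were built with this service_context,
# each chat/query call accumulates counts on the handler:
#   token_counter.prompt_llm_token_count
#   token_counter.completion_llm_token_count
#   token_counter.total_llm_token_count
# Call token_counter.reset_counts() before the next user's request.
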
@Logan M Guess I'm lucky!!
Plain Text
# Import paths below assume the legacy (pre-0.10) llama_index package layout.
from dataclasses import dataclass, field
from typing import List, Optional

from llama_index.chat_engine import ContextChatEngine
from llama_index.chat_engine.types import AgentChatResponse
from llama_index.indices.base import BaseIndex
from llama_index.llms import ChatMessage
from llama_index.tools import ToolOutput


@dataclass
class CriaChatResponse(AgentChatResponse):
    """AgentChatResponse plus the raw LLM payload (e.g. OpenAI usage counts)."""

    raw: Optional[dict] = field(default_factory=dict)


class CriaChatEngine(ContextChatEngine):

    @classmethod
    def from_index(cls, index: BaseIndex, **kwargs):
        # kwargs are shared between the retriever and the engine defaults
        return cls.from_defaults(
            retriever=index.as_retriever(**kwargs),
            **kwargs,
        )

    async def achat(
        self, message: str, chat_history: Optional[List[ChatMessage]] = None
    ) -> CriaChatResponse:
        """
        Should maintain parity with the superclass method.
        """
        if chat_history is not None:
            self._memory.set(chat_history)
        self._memory.put(ChatMessage(content=message, role="user"))

        # Retrieve context nodes and assemble the full prompt
        context_str_template, nodes = await self._agenerate_context(message)
        prefix_messages = self._get_prefix_messages_with_context(context_str_template)
        all_messages = prefix_messages + self._memory.get()

        chat_response = await self._llm.achat(all_messages)
        ai_message = chat_response.message
        self._memory.put(ai_message)

        return CriaChatResponse(  # Custom response with a bit more info
            response=str(chat_response.message.content),
            sources=[
                ToolOutput(
                    tool_name="retriever",
                    content=str(prefix_messages[0]),
                    raw_input={"message": message},
                    raw_output=prefix_messages[0],
                )
            ],
            source_nodes=nodes,
            raw=chat_response.raw,  # Add raw payload info (includes token usage)
        )
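
(A hypothetical usage sketch of the subclass above, not from the thread; `index` and the question are placeholders, and it assumes the OpenAI raw payload is a dict with a "usage" key like the snippet at the top.)
Plain Text
# Hypothetical usage: the raw payload, token usage included, now surfaces
# on the chat response instead of being hidden by the abstraction.
async def answer_with_usage(index, question: str) -> None:
    engine = CriaChatEngine.from_index(index)
    response = await engine.achat(question)
    usage = (response.raw or {}).get("usage", {})  # OpenAI-style usage block
    print(response.response)
    print(usage.get("prompt_tokens"), usage.get("completion_tokens"), usage.get("total_tokens"))
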
That's all I had to do lol
might benefit honestly from making that change in the lib tho
AgentChatResponse could return the raw payload dict
Most LLMs have a raw response
In fact all have a raw response I would assume 😂
Yea that's fair. As you can see it's there, just not passed to the top level 😅
Ooh and quick question, is there any overhead for creating a query engine? Anything loaded or whatever that's heavy on CPU?
Creating a query engine should be essentially free 🤔
Perfect. I thought so, but ya never know
Because I don't want to keep em in memory
Some tasks don't require chats, just a one-off query
I'd rather not keep a query engine in memory 24/7, and just create it whenever a query is made
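(A minimal sketch, not from the thread, of that create-on-demand pattern; `index` and the question are placeholders.)
Plain Text
# Build the query engine only when a one-off query comes in, rather than keeping
# one resident in memory; construction is cheap, the LLM call is the real cost.
def run_one_off_query(index, question: str) -> str:
    query_engine = index.as_query_engine()
    return str(query_engine.query(question))
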
But then of course the main method we use is a subclassed ContextChatEngine since we're building chatbots
Btw, we are starting a project soon to wrangle our async stuff into order.

Since you seem to be an extreme power user of the library, it might be good to find a time to chat about your experience/pain points so far.
Sure I'd be open for that
This lib has been a huge help in taking what would have been an impossibly large project and turning it into a manageable one.
Cool! Do you have time sometime this week? I'm pretty free after today

(I'm in CST time btw)
Wed and Fri this week I'm free pretty much all day; after that it gets a bit harder until September 6th. I can still probably squeeze something in, it would just depend more on your availability, so I can see if anything matches
Sweet, how about tomorrow (Wednesday) at like 3pm CST?
What's your email? I can send a link
koganisa@yorku.ca
Yea see ya then! 💪