Updated 5 months ago

LLM calls

At a glance

The post asks how many LLM (Large Language Model) calls happen when using a single query engine query method, such as a retriever query engine with default parameters. The comments provide the following insights:

One community member explains that the number of LLM calls depends on factors like the top k, chunk size, and how much input the LLM can fit. They mention that LlamaIndex will "compact" the prompt and try to fit as much text into each LLM call. If all the text fits in one call, then it's a single LLM call with the default settings.

Another community member asks if it's different or the same for Agents, and the response is that with Agents, it's a bit different. Agents usually work in rounds or iterations, where each iteration would require an LLM call.

A final community member elaborates, stating that with Agents there are at least 3 LLM calls: one to read the chat history and latest message and decide whether to write a response or call a tool, one to call the tool, and a last one to either write a final response or continue the loop and call another tool.

There is no explicitly marked answer in the provided information.

Can anyone here help me understand how many LLM calls happen when we use a single query engine query method? (Say, a retriever query engine with default params.)
4 comments
It depends on the top k, your chunk size, and how much input the LLM can fit.

LlamaIndex will take all nodes and "compact" the prompt, stuffing as much text as possible into each LLM call.

If all the text fits in one prompt, then it's a single LLM call. With default settings, this should always be true.
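To make the dependence on top k, chunk size, and prompt capacity concrete, here is a minimal sketch of the counting logic. It is a simplified greedy model, not LlamaIndex's actual implementation: `estimate_llm_calls` and `prompt_budget` are hypothetical names, and the example sizes (chunks of about 1024 tokens, room for about 3000 tokens of context per prompt) are assumptions chosen for illustration.

```python
def estimate_llm_calls(chunk_tokens, prompt_budget):
    """Estimate LLM calls when retrieved chunks are packed greedily into prompts.

    chunk_tokens:  token count of each retrieved chunk (one entry per top-k node).
    prompt_budget: tokens of retrieved text that fit in one prompt (hypothetical value).
    """
    calls = 0
    used = 0
    for size in chunk_tokens:
        if size > prompt_budget:
            raise ValueError("a single chunk does not fit in the prompt budget")
        # Start a new prompt (one more LLM call) when the current one is full.
        if calls == 0 or used + size > prompt_budget:
            calls += 1
            used = 0
        used += size
    return calls


# Two retrieved chunks that fit together: one call.
print(estimate_llm_calls([1024, 1024], prompt_budget=3000))  # 1

# A larger top k with the same budget spills into more calls.
print(estimate_llm_calls([1024] * 6, prompt_budget=3000))  # 3
```

Raising the top k or the chunk size, or using a model with a smaller context window, is what pushes the count above one in this model.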
Is it different or the same for Agents?
With Agents, it's a bit different. Usually these work in rounds/iterations, where each iteration requires an LLM call.
Yeah, as Andrei mentioned, agents can involve more calls. At a minimum there are 3: one to read the chat history + latest message and either write a response or call a tool, one to call the tool, and a last one to either write a final response or continue the loop and call another tool.
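The round-based accounting for agents can be sketched as a loop that spends one LLM call per iteration. This is a toy simulation, not a real agent: `run_agent` is a hypothetical helper, and `decisions` is a scripted list standing in for what the LLM would decide in each round. How a given agent implementation splits tool selection and tool invocation into separate calls varies, so real totals may differ from this model.

```python
def run_agent(decisions, max_iterations=10):
    """Count LLM calls in a scripted agent loop.

    decisions: list of "tool" or "final", one per round, standing in for
               the LLM's choice in that round (hypothetical scripted input).
    Returns (llm_calls, tool_runs).
    """
    llm_calls = 0
    tool_runs = 0
    for decision in decisions[:max_iterations]:
        llm_calls += 1  # every round asks the LLM what to do next
        if decision == "final":
            break  # the LLM wrote a final response, so the loop ends
        tool_runs += 1  # otherwise a tool runs and its output feeds the next round
    return llm_calls, tool_runs


# Two tool rounds followed by a final response.
print(run_agent(["tool", "tool", "final"]))  # (3, 2)
```

In this model the call count grows with the number of rounds, which is why an agent that chains several tools costs noticeably more than a single query engine call.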