Within the LlamaIndex Framework, how do I see the complete query sent to the LLM in order to monitor prompt formatting?

At a glance

The thread covers how to monitor prompt formatting and the full chain of communication within the LlamaIndex Framework. The first answer shows how to enable verbose mode on a query engine to see the complete query sent to the LLM. A follow-up notes that this does not display the full chain of communication, and the suggested solution is the QueryPipeline interface with the verbose parameter set to True. The final question, how to print the prompt_tmpl for auditing, has no explicitly marked answer.

@kapa.ai Within the LlamaIndex Framework, how do I see the complete query sent to the LLM in order to monitor prompt formatting? Is there a verbose mode for the query engines?
Please stand by @cablecutter as I search the knowledge base. I normally take about 20 seconds or so. :wave:
----------------------
Yes, within the LlamaIndex Framework, you can monitor the prompt formatting by enabling the verbose mode for the query engines. This can be done by setting the verbose parameter to True when building and configuring a query engine. Here is an example of how to do it:

Plain Text
query_engine = index.as_query_engine(
    response_mode="tree_summarize",
    verbose=True,
)

This builds and configures a query engine directly from an index in a single line. The verbose=True argument is what enables the verbose mode, allowing you to see the complete query sent to the LLM.
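
As a quick usage sketch (the question string here is an assumption, not from the thread), the verbose trace is printed while the query executes:

Plain Text
# With verbose=True, intermediate steps are printed as the query runs.
response = query_engine.query("What did the author do growing up?")
print(response)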

You can find more details about this in the LlamaIndex documentation.
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
@kapa.ai query_engine = index.as_query_engine(response_mode="tree_summarize", verbose=True) does not display the full chain when pprint-ing the response. How do I see the full chain of communication?
Let me check the knowledge base for you. One moment! :books:
----------------------
To see the full chain of communication, you can use the QueryPipeline interface with the verbose parameter set to True. This shows the full chain of communication, including the prompt formatting and the LLM's response. Here is an example:

Plain Text
from llama_index.query_pipeline import QueryPipeline

p = QueryPipeline(chain=[component, prompt_tmpl, llm], verbose=True)
output = p.run(movie="Love Actually")
print(str(output))


In this example, component, prompt_tmpl, and llm are the modules of the query pipeline, executed in order. The verbose=True argument enables verbose mode, printing each module's inputs and outputs as the chain runs.
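
For reference, here is one way those three names might be defined. The preprocessing function, template string, and LLM choice below are illustrative assumptions, not from the thread, and the import paths follow the same pre-0.10 layout as the answer above:

Plain Text
from llama_index.llms import OpenAI
from llama_index.prompts import PromptTemplate
from llama_index.query_pipeline import FnComponent

# A simple preprocessing step; in chain mode its output feeds the
# template's single {movie} variable.
def normalize_title(movie: str) -> str:
    return movie.strip().title()

component = FnComponent(fn=normalize_title)
prompt_tmpl = PromptTemplate("Please suggest three movies similar to {movie}.")
llm = OpenAI(model="gpt-3.5-turbo")  # any LlamaIndex LLM works; the model is an assumption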

You can find more details about this in the LlamaIndex documentation.
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
@kapa.ai Great, now how do I print the prompt_tmpl for auditing?
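
There is no marked answer to this last question in the thread, but here is a minimal sketch, assuming prompt_tmpl is the PromptTemplate shown above: the raw template string is available as its template attribute, and format() renders it with concrete values.

Plain Text
# Raw template, placeholder intact: useful for auditing the wording.
print(prompt_tmpl.template)
# The fully formatted prompt, as the pipeline would send it to the LLM.
print(prompt_tmpl.format(movie="Love Actually"))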