Find answers from the community

Prompt

I'm encountering an issue where my retrieved context doesn't seem to be sent to the LLM correctly, and to debug it I need to see the entire prompt that was sent to the LLM (after the chat messages are transformed and the context is inserted). I basically want a way to see all of the text that gets sent to the LLM when I call stream_chat like this:

Plain Text
query_stream = chat_service.stream_chat(
    messages=all_messages,
    use_context=True,
)


The response of chat_service.stream_chat() is of type CompletionGen, which only contains a list of the sources. I'd like to keep a copy of the whole prompt that is sent to the LLM for each invocation of stream_chat, for debugging purposes.

Does anyone know how this might be done in llama-index without serious modifications to the framework code?

Also, if there's some way to know for sure how the nodes and messages get composed into the prompt, that would also be sufficient.
2 comments
If you want the complete prompt sent to the LLM, you'll need to implement a callback.

We have an existing callback that just prints the LLM inputs/outputs. You can modify that into a custom callback if you want.
https://docs.llamaindex.ai/en/stable/module_guides/observability/observability.html#simple-llm-inputs-outputs

https://github.com/run-llama/llama_index/blob/b20675ae7d3fb2a61f220bb324399f62443624ef/llama_index/callbacks/simple_llm_handler.py#L7
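For example, a capture-only variant of that handler could look roughly like this. This is a sketch assuming the llama-index 0.9-era import paths from the linked commit (newer releases reorganize these modules), and PromptCaptureHandler is just an illustrative name:

Python
from typing import Any, Dict, List, Optional

from llama_index.callbacks.base_handler import BaseCallbackHandler
from llama_index.callbacks.schema import CBEventType, EventPayload


class PromptCaptureHandler(BaseCallbackHandler):
    """Keep a copy of everything handed to the LLM instead of printing it."""

    def __init__(self) -> None:
        super().__init__(event_starts_to_ignore=[], event_ends_to_ignore=[])
        self.captured: List[Dict[str, Any]] = []

    def on_event_start(
        self,
        event_type: CBEventType,
        payload: Optional[Dict[str, Any]] = None,
        event_id: str = "",
        parent_id: str = "",
        **kwargs: Any,
    ) -> str:
        # Completion-style calls carry PROMPT; chat-style calls carry MESSAGES.
        # Capturing at event start keeps the prompt even if the stream later fails.
        if event_type == CBEventType.LLM and payload is not None:
            if EventPayload.PROMPT in payload:
                self.captured.append({"prompt": str(payload[EventPayload.PROMPT])})
            elif EventPayload.MESSAGES in payload:
                self.captured.append(
                    {"messages": [str(m) for m in payload[EventPayload.MESSAGES]]}
                )
        return event_id

    def on_event_end(
        self,
        event_type: CBEventType,
        payload: Optional[Dict[str, Any]] = None,
        event_id: str = "",
        **kwargs: Any,
    ) -> None:
        return

    def start_trace(self, trace_id: Optional[str] = None) -> None:
        return

    def end_trace(
        self,
        trace_id: Optional[str] = None,
        trace_map: Optional[Dict[str, List[str]]] = None,
    ) -> None:
        return

You then need to get the handler into the callback manager that your chat engine's service context uses, e.g. ServiceContext.from_defaults(callback_manager=CallbackManager([handler])) wherever your app constructs it; after each stream_chat call, handler.captured holds the fully formatted prompt or message list.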


We also have a ton of other observability integrations. Personally, I like Arize.

https://docs.llamaindex.ai/en/stable/module_guides/observability/observability.html#arize-phoenix
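The basic Phoenix setup from that docs page looks roughly like this (version-dependent):

Python
import phoenix as px
import llama_index

# Launch the local Phoenix UI, then route llama-index traces to it.
px.launch_app()
llama_index.set_global_handler("arize_phoenix")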
Relating to the context prompt, does anyone know if there is a way to change the prompt surrounding the provided context? The part that says "answer questions about the above document, or if it's not in there, say idk".
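If the engine underneath is llama-index's ContextChatEngine, the text wrapped around the retrieved nodes is controlled by its context_template argument, while the "only answer from the context" instruction usually comes from the system prompt. A rough sketch, assuming llama-index ~0.9 and that you build the engine yourself (retriever and service_context are placeholders for objects from your own app):

Python
from llama_index.chat_engine import ContextChatEngine

# retriever and service_context are assumed to exist in your app already.
chat_engine = ContextChatEngine.from_defaults(
    retriever=retriever,
    service_context=service_context,
    system_prompt="Answer only from the provided context; otherwise say you don't know.",
    context_template=(
        "Context information is below.\n"
        "--------------------\n"
        "{context_str}\n"
        "--------------------\n"
    ),
)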