In a simple scenario such as:
Python
from llama_index.core import StorageContext, load_index_from_storage
from llama_index.vector_stores.faiss import FaissVectorStore

vector_store = FaissVectorStore.from_persist_dir(FLAGS.vector_store_dir)
storage_context = StorageContext.from_defaults(vector_store=vector_store, persist_dir=FLAGS.vector_store_dir)
index = load_index_from_storage(storage_context=storage_context)
query_engine = index.as_query_engine(similarity_top_k=FLAGS.similarity_top_k)

When a response is generated with response = query_engine.query("My question!"), how do I actually get the whole prompt that was sent to the LLM (containing the system message and all the parsed context text)?
I thought this was conceptually easy but couldn't figure it out just from the codebase...
3 comments
You can get all the nodes that were used to create the final response from the response object.

print(response.source_nodes)
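
For instance, a minimal sketch (assuming the standard NodeWithScore attributes) that prints each retrieved chunk with its similarity score:

Python
response = query_engine.query("My question!")
for node_with_score in response.source_nodes:
    # each entry is a NodeWithScore: the retrieved chunk plus its retrieval score
    print(node_with_score.score)
    print(node_with_score.node.get_content())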

Now for the prompt: you can either set verbose=True on your query engine or use an observability tool like Langfuse or Arize Phoenix.

https://docs.llamaindex.ai/en/stable/examples/callbacks/LangfuseCallbackHandler/?h=lang
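
Roughly, wiring up the Langfuse callback handler from that doc looks like this (a sketch; it assumes the langfuse package is installed and the LANGFUSE_SECRET_KEY / LANGFUSE_PUBLIC_KEY / LANGFUSE_HOST environment variables are set):

Python
from llama_index.core import Settings
from llama_index.core.callbacks import CallbackManager
from langfuse.llama_index import LlamaIndexCallbackHandler

# register the handler globally so every query is traced in Langfuse
langfuse_callback_handler = LlamaIndexCallbackHandler()
Settings.callback_manager = CallbackManager([langfuse_callback_handler])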

There is a new Instrumentation module that you can use: https://docs.llamaindex.ai/en/stable/examples/instrumentation/basic_usage/?h=instrum

This is super easy to implement
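
As a rough sketch based on that basic-usage doc (the event and attribute names here are assumptions that may shift between versions), an event handler that prints the full chat messages, system prompt and context-stuffed user message included, right before each LLM call could look like:

Python
from llama_index.core.instrumentation import get_dispatcher
from llama_index.core.instrumentation.event_handlers import BaseEventHandler
from llama_index.core.instrumentation.events.llm import LLMChatStartEvent

class PromptPrinter(BaseEventHandler):
    @classmethod
    def class_name(cls) -> str:
        return "PromptPrinter"

    def handle(self, event, **kwargs) -> None:
        # LLMChatStartEvent carries the exact messages about to be sent to the LLM
        if isinstance(event, LLMChatStartEvent):
            for message in event.messages:
                print(f"[{message.role}] {message.content}")

# attach to the root dispatcher so every LLM call is observed
get_dispatcher().add_event_handler(PromptPrinter())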
I guess what I wanted to know was how to print something like this: https://cloud.langfuse.com/project/cltipxbkn0000cdd7sbfbpovm/traces/96e3c191-4c90-49b9-a61b-be55e8477129?observation=11c40e83-868c-4680-a204-5307b3709541
without needing Langfuse. But I guess that isn't really possible without writing custom classes etc., since fusing the prompt with the retrieved context and the system prompt is handled in the backend without much native observability... I'll try Langfuse then.
OK, Langfuse is pretty easy to self-host and extremely informative, just what I wanted. Thanks!!