
Updated 2 months ago

where is the correct place to print out

Where is the correct place to print out the final prompt that's being sent to the LLM? Would I need to leverage the callback manager in the RetrieverQueryEngine when assigning the variable vector_query_engine? Any good examples?

Plain Text
    
def _query_index(self, query_engine: RetrieverQueryEngine, query: str) -> RESPONSE_TYPE:
        embedded_query = Settings.embed_model.get_text_embedding(query)      
        response = query_engine.query(QueryBundle(query_str=query, embedding=embedded_query))
        
        return response

def _create_query_engine(self) -> RetrieverQueryEngine:
        vector_index = VectorStoreIndex.from_vector_store(
            vector_store=self.vector_store,
            embed_model=Settings.embed_model,
        )
        
        vector_retriever = VectorIndexRetriever(index=vector_index, similarity_top_k=5)
        
        vector_query_engine = RetrieverQueryEngine(
            retriever=vector_retriever,
            response_synthesizer=self.response_synthesizer,
            node_postprocessors=[SimilarityPostprocessor(similarity_cutoff=0.50)],
        )
        
        vector_query_engine.update_prompts({"response_synthesizer:text_qa_template": self.qa_prompt_tmpl})
                
        return vector_query_engine

def query_rag(self, query: str) -> Dict[str, Any]:
        vector_query_engine = self._create_query_engine()

        response = self._query_index(query_engine=vector_query_engine, query=query)
5 comments
Plain Text
from llama_index.core import set_global_handler
set_global_handler("simple")
Put that at the top of your code, before the LLM and query engine are created.
I can get printouts of:

Plain Text
**********
Trace: query
    |_CBEventType.QUERY -> 9.163249 seconds
      |_CBEventType.RETRIEVE -> 0.006792 seconds
      |_CBEventType.SYNTHESIZE -> 9.156255 seconds
        |_CBEventType.TEMPLATING -> 6e-06 seconds
        |_CBEventType.LLM -> 2.308461 seconds
        |_CBEventType.TEMPLATING -> 1.1e-05 seconds
        |_CBEventType.LLM -> 1.531016 seconds
        |_CBEventType.TEMPLATING -> 7e-06 seconds
        |_CBEventType.LLM -> 1.433674 seconds
        |_CBEventType.TEMPLATING -> 6e-06 seconds
        |_CBEventType.LLM -> 1.585802 seconds
        |_CBEventType.TEMPLATING -> 7e-06 seconds
        |_CBEventType.LLM -> 2.289968 seconds
**********
That worked. Do I need to set the callback managers everywhere using Settings with that global handler then?
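
Probably not everywhere. My understanding is that the global handler is picked up by callback managers created after the call, so the main thing is to set it before the LLM, response synthesizer, and query engine are built. If you would rather wire it up explicitly through Settings, here is a minimal sketch, assuming a recent llama_index.core. The helper names install_prompt_debugger and print_llm_prompts are made up for illustration and are not part of the library.

```python
def install_prompt_debugger():
    """Attach a LlamaDebugHandler to the global Settings and return it.

    Hypothetical helper. Call it before building the LLM, the response
    synthesizer, or the query engine so they can pick up the manager.
    """
    # Imported inside the function so this file still loads where
    # llama_index is not installed.
    from llama_index.core import Settings
    from llama_index.core.callbacks import CallbackManager, LlamaDebugHandler

    debug_handler = LlamaDebugHandler(print_trace_on_end=True)
    Settings.callback_manager = CallbackManager([debug_handler])
    return debug_handler


def print_llm_prompts(debug_handler):
    """Print the input recorded for every LLM call seen so far."""
    from llama_index.core.callbacks import EventPayload

    for events in debug_handler.get_llm_inputs_outputs():
        # Each entry holds the start event first, then the end event.
        payload = events[0].payload or {}
        # Completion-style calls record PROMPT, chat-style calls record MESSAGES.
        llm_input = payload.get(EventPayload.PROMPT) or payload.get(EventPayload.MESSAGES)
        print(llm_input)


# Usage sketch with the names from the question (not run here):
#   handler = install_prompt_debugger()
#   response = rag.query_rag("some question")   # rag is your own class instance
#   print_llm_prompts(handler)
```

One caveat: if self.response_synthesizer was built earlier with its own callback manager, it may not see a handler installed afterwards, so it is safer to install the handler first or pass the same callback manager when creating the synthesizer.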