Hello! I have a ContextChatEngine with a SentenceTransformerRerank in node_postprocessors for reranking. Before retrieving and reranking, I would like to build a condensed query from the last user message plus the chat history. Is there a way to use the condensed query for retrieval and reranking, while still sending the original query (together with the retrieved context) to the LLM? Thank you very much!
3 comments
A little more context:
Plain Text
# Imports for a llama_index 0.9.x-style setup (service_context era);
# adjust the paths if you are on a different version.
from llama_index.chat_engine import ContextChatEngine
from llama_index.postprocessor import (
    MetadataReplacementPostProcessor,
    SentenceTransformerRerank,
)

# Replace the sentence-window metadata with the full window text before reranking
node_postprocessors = [
    MetadataReplacementPostProcessor(target_metadata_key="window"),
]

# Cross-encoder reranker that keeps the top_n nodes
reranker = SentenceTransformerRerank(
    model="cross-encoder/msmarco-MiniLM-L6-en-de-v1",
    top_n=reranker_top_n,
)
node_postprocessors.append(reranker)

cce = ContextChatEngine.from_defaults(
    retriever=vector_index_retriever,
    service_context=self.service_context,
    node_postprocessors=node_postprocessors,
    system_prompt=system_prompt,
    context_template=CUSTOM_CONTEXT_TEMPLATE,
)
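For context on why this needs a workaround: ContextChatEngine uses the incoming message both as the retrieval query and as the user message sent to the LLM, for example:
Plain Text
# The same string drives retrieval/reranking and the final LLM request
response = cce.chat("What does the warranty cover?")
print(response.response)

# chat() / stream_chat() only take the message (and optional chat_history),
# so there is no built-in way to pass a separate retrieval query.
streaming_response = cce.stream_chat("What does the warranty cover?")
for token in streaming_response.response_gen:
    print(token, end="")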
Try following this pseudocode:

Plain Text
# Pseudocode only: condense_query_function, retrieve, rerank and query_llm are
# placeholders, not actual ContextChatEngine methods.

# Step 1: Condense the query from the last user message + chat history
condensed_query = condense_query_function(last_user_message, conversation_history)

# Step 2: Retrieve and rerank using the condensed query
# ('cce' is your ContextChatEngine instance)
retrieved_nodes = cce.retrieve(condensed_query)
reranked_nodes = cce.rerank(retrieved_nodes)

# Step 3: Keep the original query
# 'original_query' is the user's original (uncondensed) message.
# You need to make sure the final LLM request is built from the original query
# plus the retrieved context, not from the condensed query.

# Step 4: Send the original query and the reranked context to the LLM
response = cce.query_llm(original_query, reranked_nodes)
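For step 1, one option is to let the LLM itself condense the question, similar to what CondenseQuestionChatEngine does internally. A rough sketch; the prompt wording, the condense_query_function signature and the llm variable are assumptions, not tested code:
Plain Text
from llama_index.prompts import PromptTemplate

# Hypothetical condense prompt; adjust the wording to your use case.
CONDENSE_PROMPT = PromptTemplate(
    "Given the following conversation and a follow-up message, rewrite the "
    "follow-up message as a standalone question.\n"
    "Chat history:\n{chat_history}\n"
    "Follow-up message: {question}\n"
    "Standalone question: "
)

def condense_query_function(last_user_message, conversation_history, llm):
    # conversation_history is assumed to be a list of ChatMessage objects
    history_str = "\n".join(f"{m.role.value}: {m.content}" for m in conversation_history)
    return llm.predict(
        CONDENSE_PROMPT,
        chat_history=history_str,
        question=last_user_message,
    )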
Thank you very much - so the best approach would probably be a wrapper / custom class similar to ContextChatEngine, so I can implement chat / stream_chat methods with the required changes. Maybe I'm missing an obvious way. I get the retrieved_nodes / reranked_nodes by calling it like this:
Plain Text
from llama_index.schema import QueryBundle

retrieved_nodes = chat_engine._retriever.retrieve(condensed_question)
rerank_processors = [p for p in chat_engine._node_postprocessors if isinstance(p, SentenceTransformerRerank)]
if rerank_processors:
    # the public postprocess_nodes expects a QueryBundle (or query_str=...), not a raw string
    retrieved_nodes = rerank_processors[0].postprocess_nodes(retrieved_nodes, QueryBundle(condensed_question))


But I would still have to reimplement the context_template / prefix_messages handling that happens in the chat / stream_chat methods of the ContextChatEngine class.
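One way to keep that manageable is a small helper that roughly mirrors what ContextChatEngine.chat does internally, but splits the two queries. A minimal sketch under the same assumptions as above; the function name and argument list are hypothetical, chat_history is a list of ChatMessage, and memory/streaming handling is left out:
Plain Text
from llama_index.llms import ChatMessage, MessageRole
from llama_index.schema import QueryBundle

def condensed_context_chat(llm, retriever, node_postprocessors,
                           system_prompt, context_template,
                           chat_history, original_message, condensed_question):
    # 1. Retrieve with the condensed question
    nodes = retriever.retrieve(condensed_question)

    # 2. Apply all postprocessors (metadata replacement, reranking, ...)
    for postprocessor in node_postprocessors:
        nodes = postprocessor.postprocess_nodes(nodes, QueryBundle(condensed_question))

    # 3. Build the context block: join the node texts and fill the template,
    #    roughly what ContextChatEngine does with context_template / {context_str}
    context_str = "\n\n".join(n.node.get_content() for n in nodes)
    context_message = context_template.format(context_str=context_str)

    # 4. Send the ORIGINAL message (plus history and context) to the LLM
    messages = [
        ChatMessage(role=MessageRole.SYSTEM, content=system_prompt),
        ChatMessage(role=MessageRole.SYSTEM, content=context_message),
        *chat_history,
        ChatMessage(role=MessageRole.USER, content=original_message),
    ]
    return llm.chat(messages)  # use llm.stream_chat(messages) for streaming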