
Updated 4 months ago

Condensation + System prompt

At a glance
I am building a chat engine that should 1) call the LLM to condense the previous chat history and the current question into a new standalone question, 2) query the index with the condensed question to get the results, and 3) call the LLM with a system prompt, the condensed question, and the query results. How can I do that with llama_index? Do any of the chat modes support both condensing and a system prompt?
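For reference, the three steps can be sketched without committing to any particular library. This is only a minimal illustration, not llama_index or LangChain API: `chat_turn`, `llm`, `retrieve`, and both templates are hypothetical names, where `llm` is assumed to be any callable that maps a prompt string to a completion string and `retrieve` any callable that maps a query string to a list of text chunks.

```python
from typing import Callable, List, Tuple

# Hypothetical templates; adjust the wording to your use case.
CONDENSE_TEMPLATE = (
    "Given the following conversation and a follow up question, rephrase the "
    "follow up question to be a standalone question.\n\n"
    "Chat History:\n{chat_history}\n"
    "Follow Up Input: {question}\n"
    "Standalone question:"
)

ANSWER_TEMPLATE = (
    "{system_prompt}\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
    "Answer:"
)


def chat_turn(
    llm: Callable[[str], str],
    retrieve: Callable[[str], List[str]],
    system_prompt: str,
    history: List[Tuple[str, str]],
    question: str,
) -> str:
    """Run one chat turn: condense, retrieve, then answer."""
    # Step 1: condense the history and the follow-up into a standalone question.
    if history:
        chat_history = "\n".join(f"{role}: {text}" for role, text in history)
        standalone = llm(
            CONDENSE_TEMPLATE.format(chat_history=chat_history, question=question)
        ).strip()
    else:
        # Assumption: with no history there is nothing to condense.
        standalone = question

    # Step 2: query the index (or any retriever) with the condensed question.
    chunks = retrieve(standalone)

    # Step 3: answer using the system prompt, condensed question and results.
    answer = llm(
        ANSWER_TEMPLATE.format(
            system_prompt=system_prompt,
            context="\n".join(chunks),
            question=standalone,
        )
    )

    # Record the turn so the next call can condense against it.
    history.append(("Human", question))
    history.append(("AI", answer))
    return answer
```

Skipping the condense call on the first turn is a design choice in this sketch that saves one LLM call; remove the `if` to condense on every turn.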
4 comments
In llama_index, CondenseQuestionChatEngine doesn't allow for a system prompt, and ContextChatEngine has no provision for condensing questions.
The Kendra example that I shared allows for this:
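from langchain.chains import ConversationalRetrievalChain
from langchain.prompts import PromptTemplate
from langchain.retrievers import AmazonKendraRetriever


# Assumption: the original snippet omits the enclosing function, so the name
# build_chain and its parameters (llm, kendra_index_id, region) are illustrative.
def build_chain(llm, kendra_index_id, region):
    retriever = AmazonKendraRetriever(index_id=kendra_index_id, region_name=region)

    # Final prompt: plays the role of the system prompt for the answering call.
    prompt_template = """
    The following is a friendly conversation between a human and an AI.
    The AI is talkative and provides lots of specific details from its context.
    If the AI does not know the answer to a question, it truthfully says it
    does not know.
    {context}
    Instruction: Based on the above documents, provide a detailed answer for, {question} Answer "don't know"
    if not present in the document.
    Solution:"""
    PROMPT = PromptTemplate(
        template=prompt_template, input_variables=["context", "question"]
    )

    # Condense prompt: turns chat history plus follow-up into a standalone question.
    condense_qa_template = """
    Given the following conversation and a follow up question, rephrase the follow up question
    to be a standalone question.

    Chat History:
    {chat_history}
    Follow Up Input: {question}
    Standalone question:"""
    standalone_question_prompt = PromptTemplate.from_template(condense_qa_template)

    # The chain condenses first, retrieves with the result, then answers with PROMPT.
    qa = ConversationalRetrievalChain.from_llm(
        llm=llm,
        retriever=retriever,
        condense_question_prompt=standalone_question_prompt,
        return_source_documents=True,
        combine_docs_chain_kwargs={"prompt": PROMPT},
    )
    return qa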
This is great because it allows condensing the question and also tuning the final prompt sent to the LLM.
I tried replacing AmazonKendraRetriever with a LlamaIndexRetriever (from langchain), but it doesn't work. 🫣