
Updated 2 years ago


Are two queries and two responses involved in this? The first query is the question the user asks, which is passed to the first prompt. The bot gives a response, then that response is used by llama to generate a query for the second prompt, and that second response is returned to the user? Thank you for the clarification.
Normally, all the text retrieved by the index does not fit into one LLM call

So, llama index refines an answer across multiple chunks

It gets an initial answer using the first chunk

Then, it sends its existing answer plus some new context, and asks the LLM to either update its existing answer using the new context, or just repeat its existing answer
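The refine loop described above can be sketched in plain Python. Note that the `llm` callable and the prompt wording here are hypothetical stand-ins for illustration, not llama_index's actual internals:

```python
def refine_answer(llm, query, chunks):
    """Build an answer across chunks that don't all fit in one LLM call.

    `llm` is a hypothetical callable: prompt string in, answer string out.
    """
    # First chunk: ask the question directly against the first piece of context
    answer = llm(f"Context: {chunks[0]}\nQuestion: {query}\nAnswer:")

    # Remaining chunks: ask the LLM to update or repeat the existing answer
    for chunk in chunks[1:]:
        answer = llm(
            f"Original question: {query}\n"
            f"Existing answer: {answer}\n"
            f"New context: {chunk}\n"
            "Refine the existing answer using the new context, "
            "or repeat it unchanged if the context is not helpful."
        )
    return answer
```

Each iteration sends one more chunk plus the running answer, so no single call has to hold all the retrieved text at once.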
So it continues the conversation with the AI to get the best-fitting answer. What I find funny is that index.query communicates with OpenAI in natural human language, not XML, JSON, EDI, etc.
Is the second prompt always the same or is it derived from the llama prompt and langchain prompt?
I'm confused about what happens to the prompt created by a langchain agent
Is it ignored, or?
So, the langchain agent decides on its own the initial query to llama index (maybe it's something like "What is a cat?")

llama index takes that query, gets the relevant nodes, and sends them to the LLM. If all the text from the nodes does not fit in a single LLM call, then once there is an initial answer, llama index asks the LLM again using the next piece of context + the original query + the previous answer. The LLM has to either update the existing answer using the new context, or repeat the existing answer back if the new context is not helpful
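The "gets the relevant nodes" step can be sketched as a toy top-k similarity search. The bag-of-words `embed` here is a hypothetical stand-in for a real embedding model, just to show the shape of retrieval:

```python
import math
import re

def embed(text):
    # Hypothetical stand-in for a real embedding model:
    # a simple bag-of-words count vector
    vec = {}
    for token in re.findall(r"\w+", text.lower()):
        vec[token] = vec.get(token, 0) + 1
    return vec

def cosine(a, b):
    # Cosine similarity between two sparse count vectors
    dot = sum(count * b.get(token, 0) for token, count in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_k_nodes(query, nodes, k=2):
    # Rank stored text chunks ("nodes") by similarity to the query
    q = embed(query)
    return sorted(nodes, key=lambda n: cosine(q, embed(n)), reverse=True)[:k]
```

The retrieved top-k node texts are what then gets packed into the LLM call(s), refining across them if they don't all fit.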

Maybe seeing the prompt templates will help make more sense, one sec
So here you can see the initial text_qa prompt (bottom) and the refine prompt (above). The initial query_str does not change
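In case the attached screenshot doesn't render, the two templates look approximately like this. This is a paraphrase of llama_index's default prompts; the exact wording varies by version:

```python
# Approximate shape of llama_index's default text_qa prompt
# (exact wording varies by library version)
TEXT_QA_PROMPT = (
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information and not prior knowledge, "
    "answer the question: {query_str}\n"
)

# Approximate shape of the default refine prompt
REFINE_PROMPT = (
    "The original question is as follows: {query_str}\n"
    "We have provided an existing answer: {existing_answer}\n"
    "We have the opportunity to refine the existing answer "
    "(only if needed) with some more context below.\n"
    "------------\n"
    "{context_msg}\n"
    "------------\n"
    "Given the new context, refine the original answer to better "
    "answer the question. If the context isn't useful, return the "
    "original answer.\n"
)
```

Note that `{query_str}` appears unchanged in both templates, which is why the initial query is the same across every call.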
[Attachment: image.png]
This was extremely helpful thank you
The prompt is the core for the LLM