Prompt issues

Hi guys, when using any refine prompt, sometimes the output ends up being "The new context does not provide any additional information that would require a refinement of the original answer. The original answer remains accurate and complete."
rather than the original response. Any idea what's happening here? It's happening on both the Tree and the Simple Vector index.
------------
Given the new context, refine the original answer to better answer the question. If the context isn't useful, output the original answer again.
DEBUG:llama_index.indices.response.response_builder:> Refined response: The new context does not provide any additional information that would require a refinement of the original answer. The original answer remains accurate and complete.
Refined response: The new context does not provide any additional information that would require a refinement of the original answer. The original answer remains accurate and complete.
INFO:llama_index.token_counter.token_counter:> [query] Total LLM token usage: 9426 tokens

'Tree Index': {
    "query_str": user_question,
    # "mode": "S",
    "service_context": service_context,
    "verbose": True,
    # "use_async": True
},
'Simple Vector Index': {
    "query_str": user_question,
    "mode": "default",
    "response_mode": "tree_summarize",
    "similarity_top_k": 5,
    "service_context": service_context,
    "verbose": True,
    # "use_async": True
},
Yeaaaa are you using gpt-3.5? OpenAI seems to have downgraded the model recently, which is causing this problem.

When all the text doesn't fit into one LLM call, it refines the answer across many LLM calls.
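(Very roughly, that refine process looks like the sketch below. It's a simplified illustration of the create-and-refine loop, not llama_index's actual code, and the helper names are made up.)
Plain Text
# Simplified illustration of the "create and refine" response mode.
# Not llama_index's real implementation; qa_prompt/refine_prompt are hypothetical helpers.
def create_and_refine(query_str, text_chunks, llm):
    # First chunk: answer the question from scratch.
    answer = llm(qa_prompt(query_str=query_str, context_str=text_chunks[0]))
    # Every following chunk: ask the LLM to refine the existing answer,
    # which is where the "no refinement needed" responses can sneak in.
    for chunk in text_chunks[1:]:
        answer = llm(refine_prompt(
            query_str=query_str,
            existing_answer=answer,
            context_msg=chunk,
        ))
    return answer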

I've actually been working on a new refine template, if you want to test it. Let me grab the code
Plain Text
from langchain.prompts.chat import (
    AIMessagePromptTemplate,
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
)

from llama_index.prompts.prompts import RefinePrompt

# Refine Prompt
CHAT_REFINE_PROMPT_TMPL_MSGS = [
    HumanMessagePromptTemplate.from_template("{query_str}"),
    AIMessagePromptTemplate.from_template("{existing_answer}"),
    HumanMessagePromptTemplate.from_template(
        "I have more context below which can be used "
        "(only if needed) to update your previous answer.\n"
        "------------\n"
        "{context_msg}\n"
        "------------\n"
        "Given the new context, update the previous answer to better "
        "answer my previous query."
        "If the previous answer remains the same, repeat it verbatim. "
        "Never reference the new context or my previous query directly.",
    ),
]


CHAT_REFINE_PROMPT_LC = ChatPromptTemplate.from_messages(CHAT_REFINE_PROMPT_TMPL_MSGS)
CHAT_REFINE_PROMPT = RefinePrompt.from_langchain_prompt(CHAT_REFINE_PROMPT_LC)
...
index.query("my query", similarity_top_k=3, refine_template=CHAT_REFINE_PROMPT)


Just need to set the refine_template in the query kwargs you shared to use it in a graph
Yes I am! Oh, that would be great, I was going crazy trying to figure it out. I thought it was chunking. Does chunking happen automatically, or is it better to chunk the data during index creation?
Chunking happens during index construction, yeah. But the problem isn't entirely related to that 😅 just the LLM not following instructions.
Hopefully the above code helps. Feel free to try and tune it more too haha
Right now I'm doing chunk_limit = 1000. What do you generally use for create/refine?
Yea, that chunk size is fine 💪 especially for embeddings, that seems to be about the sweet spot.
And then for the service context I have the below. Anything jump out as incorrect?
max_input_size = 3000
num_output = 1000
max_chunk_overlap = 20
chunk_size_limit = 1024
# embed_model = LangchainEmbedding(HuggingFaceEmbeddings())

prompt_helper = PromptHelper(
    max_input_size=max_input_size,
    num_output=num_output,
    max_chunk_overlap=max_chunk_overlap,
    chunk_size_limit=chunk_size_limit,
)
llm_predictor = LLMPredictor(
    llm=ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo", request_timeout=1500)
)

return ServiceContext.from_defaults(
    llm_predictor=llm_predictor,
    prompt_helper=prompt_helper,
    # embed_model=embed_model,
)
You can probably change the max input size back to 4096, unless you lowered it for a specific reason.

Also maybe set the chunk_size_limit in the service_context itself, in addition to the prompt helper.

Chunking happens twice, when the index is created, and during queries. Usually you'll want it to be the same size in both
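(Concretely, something like this. It's a sketch based on the snippets above, assuming the same 0.5.x-era llama_index API used elsewhere in the thread; the 4096 comes from the suggestion to restore the full input size.)
Plain Text
# Set chunk_size_limit on the service context as well as the prompt helper,
# and bump max_input_size back up to 4096 unless it was lowered on purpose.
prompt_helper = PromptHelper(
    max_input_size=4096,
    num_output=1000,
    max_chunk_overlap=20,
    chunk_size_limit=1024,
)

service_context = ServiceContext.from_defaults(
    llm_predictor=llm_predictor,
    prompt_helper=prompt_helper,
    chunk_size_limit=1024,  # same chunk size at index build time and at query time
)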
Also, I don't think it's picking up the new chat template:
'Tree Index': {
    "query_str": user_question,
    # "mode": "S",
    "service_context": service_context,
    "verbose": True,
    "refine_template": "CHAT_REFINE_PROMPT",
    "use_async": True
},
AttributeError: 'str' object has no attribute 'partial_format'
Hmm, did you copy every line I sent? There's kinda 3 steps/variables, the initial list of messages, the langchain prompt, and then the final llama index refine prompt that gets used in the kwargs
from langchain.prompts.chat import (
    AIMessagePromptTemplate,
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
)

from llama_index.prompts.prompts import RefinePrompt

# Refine Prompt
CHAT_REFINE_PROMPT_TMPL_MSGS = [
    HumanMessagePromptTemplate.from_template("{query_str}"),
    AIMessagePromptTemplate.from_template("{existing_answer}"),
    HumanMessagePromptTemplate.from_template(
        "I have more context below which can be used "
        "(only if needed) to update your previous answer.\n"
        "------------\n"
        "{context_msg}\n"
        "------------\n"
        "Given the new context, update the previous answer to better "
        "answer my previous query. "
        "If the previous answer remains the same, repeat it verbatim. "
        "Never reference the new context or my previous query directly.",
    ),
]


CHAT_REFINE_PROMPT_LC = ChatPromptTemplate.from_messages(CHAT_REFINE_PROMPT_TMPL_MSGS)
CHAT_REFINE_PROMPT = RefinePrompt.from_langchain_prompt(CHAT_REFINE_PROMPT_LC)
You put quotes around the variable name in the kwargs
"CHAT_REFINE_PROMPT"

just remove the quotes there
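(So the Tree Index kwargs from above would become something like this; the point is passing the prompt object itself rather than its name as a string:)
Plain Text
'Tree Index': {
    "query_str": user_question,
    "service_context": service_context,
    "verbose": True,
    "refine_template": CHAT_REFINE_PROMPT,  # the RefinePrompt object, no quotes
    "use_async": True
},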
ah stupid me hahaha
With this change though, the tree index is far more powerful.
With that in mind,

how can I set the summary text here? Right now it's just the first couple hundred characters, it seems:

>[Level 0] current prompt template: Some choices are given below. It is provided in a numbered list (1 to 4), where each item in the list corresponds to a summary.
...
Provide choice in the following format: 'ANSWER: <number>' and explain why this summary was selected in relation to the question.
Glad it helped! Maybe I should add that template in a PR lol a few people have found it to be better

So in a tree index, it builds a hierarchy of summaries automatically, so there's no way to set it specifically 🤔 Then during the query, it kind of traverses the tree, which is what you're seeing there.
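(Very roughly, that traversal looks like the sketch below. It's a simplified illustration only, not llama_index's actual implementation, and all the names are hypothetical.)
Plain Text
# Simplified illustration of a tree index query: at each level the LLM is shown
# the numbered child summaries (the "Some choices are given below..." prompt)
# and picks one branch to descend into. Names here are hypothetical.
def query_tree(root_nodes, query_str, llm):
    nodes = root_nodes
    while True:
        choice = llm(choice_prompt(
            summaries=[n.summary for n in nodes],
            query_str=query_str,
        ))
        selected = nodes[parse_answer_number(choice) - 1]  # "ANSWER: <number>"
        if selected.is_leaf():
            # At a leaf, answer the query from the underlying text chunk.
            return llm(qa_prompt(context_str=selected.text, query_str=query_str))
        nodes = selected.children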