Yes, you're right, the `llm` passed to the `generate_question_context_pairs` function is used to generate synthetic questions for all of the `nodes`. And you can update the instruction prompt too:
from llama_index.core.evaluation import generate_question_context_pairs

qa_dataset = generate_question_context_pairs(
    nodes, llm=llm, num_questions_per_chunk=2,
    qa_generate_prompt_tmpl=custom_instructions,
)
The custom prompt has to include two variables, `context_str` and `num_questions_per_chunk`.
This is the default one:
DEFAULT_QA_GENERATE_PROMPT_TMPL = """\
Context information is below.
---------------------
{context_str}
---------------------
Given the context information and not prior knowledge.
generate only questions based on the below query.
You are a Teacher/ Professor. Your task is to setup \
{num_questions_per_chunk} questions for an upcoming \
quiz/examination. The questions should be diverse in nature \
across the document. Restrict the questions to the \
context information provided."
"""