Hi guys, is this a bug?
====
from llama_index.evaluation.dataset_generation import DatasetGenerator

# Expecting 1 question per node for the first 5 nodes
question_generator = DatasetGenerator(nodes=nodes[:5], num_questions_per_chunk=1)

eval_questions = question_generator.generate_questions_from_nodes()
====
From my understanding, this should generate 5 questions (1 question each for 5 nodes). But it gives me 54 questions; it seems like the default is num_questions_per_chunk=10.

If I set num_questions_per_chunk=2, it works as expected, generating 5 * 2 = 10 questions.
2 comments
It looks like the num_questions_per_chunk parameter is interpolated straight into the prompt sent to the model: https://github.com/run-llama/llama_index/blob/85de3d9a503fec159962569fb77329a43af4594e/llama_index/evaluation/dataset_generation.py#L96
Plain Text
self.question_gen_query = (
  question_gen_query
  or f"You are a Teacher/Professor. Your task is to setup {num_questions_per_chunk} questions (...)

Maybe it decided to disobey your instructions 🙂

You might be able to provide a custom prompt using the question_gen_query argument (e.g., "Your task is to setup one question (...)")
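A minimal sketch of what that override could look like. The prompt wording here only mirrors the default quoted above and can be adjusted; the commented-out DatasetGenerator call is illustrative, since the exact constructor signature depends on your llama_index version:

```python
# Custom question-generation prompt that hard-codes "one question"
# instead of relying on the interpolated number (which the model may ignore).
custom_query = (
    "You are a Teacher/Professor. Your task is to setup one question "
    "for an upcoming quiz/examination. Restrict the question to the "
    "context information provided."
)

# Hypothetical usage -- requires llama_index and your `nodes`:
# question_generator = DatasetGenerator(
#     nodes=nodes[:5],
#     num_questions_per_chunk=1,
#     question_gen_query=custom_query,  # overrides the default prompt
# )
# eval_questions = question_generator.generate_questions_from_nodes()

print(custom_query)
```

Since the number never appears in the text, the model has nothing to over-generate against.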
Thank you very much, I believe that's the reason.