That's correct, generated_dataset_from_nodes generates questions specific to each node. The generated dataset (a) tests semantic retrieval and (b) tests LLM generation.
Questions like "Are there any invoices about graphics cards?" don't really make sense for vector retrieval. In any case, to generate questions like these you'd need context of the entire dataset (a summary?), and then you could just prompt an LLM directly to generate them.
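A minimal sketch of that idea: condense each document to a one-line summary, then prompt an LLM with the combined summaries to produce corpus-level questions. The `complete` callable below is a hypothetical stand-in for whatever LLM client you use; it is not part of any specific library.

```python
from typing import Callable, List

def build_corpus_question_prompt(doc_summaries: List[str], n_questions: int = 5) -> str:
    """Combine per-document summaries into one prompt asking for
    dataset-wide questions (existence, counting, cross-document)."""
    corpus_summary = "\n".join(f"- {s}" for s in doc_summaries)
    return (
        "The following are one-line summaries of every document in a corpus:\n"
        f"{corpus_summary}\n\n"
        f"Write {n_questions} questions that can only be answered by "
        "considering the corpus as a whole, e.g. counting, existence, or "
        "cross-document comparison questions."
    )

def generate_corpus_questions(
    doc_summaries: List[str],
    complete: Callable[[str], str],  # your LLM call, e.g. a thin client wrapper
    n_questions: int = 5,
) -> List[str]:
    prompt = build_corpus_question_prompt(doc_summaries, n_questions)
    raw = complete(prompt)
    # Assumes the model returns one question per line, possibly bulleted.
    return [line.strip("- ").strip() for line in raw.splitlines() if line.strip()]
```

Note that questions generated this way stress the whole pipeline, not just the vector index: existence and counting questions often need metadata filtering or an aggregation step on top of retrieval to be answerable at all.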
Thank you for the quick response! I am trying to test the downstream performance of my RAG pipeline, so I was thinking I need a mix of questions: some that represent all the documents, or a subset of them. I feel like a summary would remove critical information. Is there a better way to do this?