
Updated last year

llama_index/llama-index-core/llama_index...

At a glance

The community member is trying to generate question/answer pairs from a document using the RagDatasetGenerator class from the llama_index library. They found two methods in the source, agenerate_dataset_from_nodes and generate_dataset_from_nodes, that appear to do the same thing. The second wraps the first in asyncio.run, so the community member expected to be able to call it synchronously. Instead they hit the warning "sys:1: RuntimeWarning: coroutine 'RagDatasetGenerator.agenerate_dataset_from_nodes' was never awaited", which they believe comes from the second method.

In the comments, another community member suggests that the pattern of using asyncio.run(self.agenerate_dataset_from_nodes()) is used in more than a few places, implying that it is a common way to run the async function properly.

There is no explicitly marked answer in the post or comments.

I’m trying to generate question/answer pairs from a document. In the source here https://github.com/run-llama/llama_index/blob/3823389e3f91cab47b72e2cc2814826db9f98e32/llama-index-core/llama_index/core/llama_dataset/generator.py#L236 there is both an

    async def agenerate_dataset_from_nodes(self) -> LabelledRagDataset:
        """Generates questions for each document."""
        return await self._agenerate_dataset(self.nodes, labelled=True)

function and a

    def generate_dataset_from_nodes(self) -> LabelledRagDataset:
        """Generates questions for each document."""
        return asyncio.run(self.agenerate_dataset_from_nodes())

function. Shouldn’t this second function be callable synchronously? I was trying to generate the question/answer pairs synchronously but kept running into the warning "sys:1: RuntimeWarning: coroutine 'RagDatasetGenerator.agenerate_dataset_from_nodes' was never awaited", which I think is coming from this function.
1 comment
hmmm, that's kind of sus. asyncio.run(self.agenerate_dataset_from_nodes()) should be running the async function properly 🤔 This pattern is used in more than a few places
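One situation known to produce this exact warning with the asyncio.run(...) wrapper pattern is calling the sync wrapper from code that is already running inside an event loop (a Jupyter notebook or an async web handler, for example). The thread does not confirm that this is what happened here, so treat the sketch below as a possible explanation only. FakeGenerator and its methods are hypothetical stand-ins for RagDatasetGenerator and make no llama_index or LLM calls.

```python
import asyncio


class FakeGenerator:
    """Hypothetical stand-in for RagDatasetGenerator (assumption, not the real class)."""

    async def agenerate(self) -> str:
        # Placeholder for the real async work.
        await asyncio.sleep(0)
        return "dataset"

    def generate(self) -> str:
        # Same shape as generate_dataset_from_nodes: a sync wrapper over the async method.
        return asyncio.run(self.agenerate())


def call_from_sync_code() -> str:
    # No event loop is running here, so asyncio.run starts one and the wrapper works.
    return FakeGenerator().generate()


async def _call_wrapper_inside_loop() -> str:
    gen = FakeGenerator()
    try:
        # A loop is already running: asyncio.run refuses to start another one.
        # The coroutine created by self.agenerate() is then dropped without being
        # awaited, which is what triggers the "was never awaited" RuntimeWarning.
        return gen.generate()
    except RuntimeError as exc:
        return f"RuntimeError: {exc}"


async def _await_directly_inside_loop() -> str:
    # Inside a running loop, awaiting the async method directly avoids the problem.
    return await FakeGenerator().agenerate()


def call_from_running_loop() -> str:
    return asyncio.run(_call_wrapper_inside_loop())


def await_from_running_loop() -> str:
    return asyncio.run(_await_directly_inside_loop())


if __name__ == "__main__":
    print(call_from_sync_code())
    print(call_from_running_loop())
    print(await_from_running_loop())
```

If this is the cause, the usual options are to await agenerate_dataset_from_nodes() directly from the async context, or to call generate_dataset_from_nodes() from a plain script where no loop is running.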