
Updated 4 months ago

Nest

At a glance

A community member encountered an issue with the SimpleDirectoryReader and TitleExtractor transformation, where the async calls were conflicting with the asyncio call in IngestionPipeline.run, causing an error. They edited their cache file to use nest_asyncio.apply(), but found it counterintuitive. The community members discussed the issue, with one suggesting using actual async entry points like await pipeline.arun(..) to avoid the issue. The community member tried this approach and confirmed that it worked.

Hey, y'all! I ran into an issue using the SimpleDirectoryReader with the TitleExtractor transformation and I wanted to run it by y'all before raising a GitHub issue.

The async calls in the method are fighting with the asyncio call in IngestionPipeline.run and throwing this error:

python 
def asyncio_run(coro: Coroutine) -> Any:
    """Gets an existing event loop to run the coroutine.

    If there is no existing event loop, creates a new one.
    """
    try:
        loop = asyncio.get_running_loop()
        if loop.is_running():
            raise RuntimeError(
                "Nested async detected. "
                "Use async functions where possible (`aquery`, `aretrieve`, `arun`, etc.). "
                "Otherwise, use `import nest_asyncio; nest_asyncio.apply()` "
                "to enable nested async or use in a jupyter notebook.\n\n"
                "If you are experiencing while using async functions and not in a notebook, "
                "please raise an issue on github, as it indicates a bad design pattern."
            )
        else:
            return loop.run_until_complete(coro)
    except RuntimeError:
        return asyncio.run(coro)


I edited my cache file to just use nest_asyncio.apply() here instead of throwing the error, but that's kinda counterintuitive for the long term.
Have y'all
a) seen this before?
b) found a decent workaround?

If there's a different transform that I could use to achieve the same thing without running up against that error, that'd be great (I haven't found one).
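
The conflict described in this post can be reproduced with the standard library alone. The sketch below is not LlamaIndex code: `extract_title`, `run_sync`, and `handler` are hypothetical stand-ins for an async transformation, a synchronous entry point that starts its own event loop, and application code that is already running inside a loop.

```python
import asyncio


async def extract_title(text: str) -> str:
    # Hypothetical stand-in for an async transformation.
    await asyncio.sleep(0)
    return text.title()


def run_sync(text: str) -> str:
    # Hypothetical stand-in for a synchronous entry point that starts its own loop.
    coro = extract_title(text)
    try:
        return asyncio.run(coro)
    except RuntimeError:
        # Close the unused coroutine so Python does not warn that it was never awaited.
        coro.close()
        raise


async def handler() -> str:
    # A loop is already running here, so the nested asyncio.run() call is rejected.
    try:
        return run_sync("hello world")
    except RuntimeError as exc:
        return f"RuntimeError: {exc}"


result = asyncio.run(handler())
print(result)
```

Called from plain synchronous code, `run_sync` works, because no loop is running yet. It only fails when something upstream (a web framework, a notebook, another async function) has already started one.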
5 comments
Applying nest_asyncio is pretty standard, especially if you are running in a notebook.

If you want to avoid that, I recommend using actual async entry points like await pipeline.arun(..)
I wasn't running in a notebook. We have an app hosted in Azure we're using to ingest content from different sources. I'll try .arun()! Thanks!
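
The async entry point approach suggested above can be sketched as follows. This is a minimal illustration under stated assumptions, not the LlamaIndex API: `SketchPipeline` and `ingest` are hypothetical names, and the only idea carried over from the thread is awaiting an `arun`-style method instead of calling a synchronous `run` from code that already has a running event loop.

```python
import asyncio


class SketchPipeline:
    """Hypothetical stand-in for an ingestion pipeline (not the LlamaIndex class)."""

    async def _extract_title(self, text: str) -> str:
        # Hypothetical async transformation.
        await asyncio.sleep(0)
        return text.title()

    async def arun(self, documents: list[str]) -> list[str]:
        # Await every transformation on the loop that is already running.
        return await asyncio.gather(*(self._extract_title(d) for d in documents))


async def ingest(documents: list[str]) -> list[str]:
    pipeline = SketchPipeline()
    # Awaiting the async entry point avoids starting a second, nested event loop.
    return await pipeline.arun(documents)


# The only place a loop is started is the top of the program.
titles = asyncio.run(ingest(["first doc", "second doc"]))
print(titles)
```

In a hosted app whose request handlers are already async, the handler would await the pipeline directly and the top-level `asyncio.run` would belong to the server, not to the ingestion code.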