Updated 3 months ago

Nest

Hey, y'all! I ran into an issue using the SimpleDirectoryReader with the TitleExtractor transformation and I wanted to run it by y'all before raising a GitHub issue.

The async calls inside TitleExtractor are fighting with the asyncio call in IngestionPipeline.run, and this helper throws its "Nested async detected" error:

def asyncio_run(coro: Coroutine) -> Any:
    """Gets an existing event loop to run the coroutine.

    If there is no existing event loop, creates a new one.
    """
    try:
        loop = asyncio.get_running_loop()
        if loop.is_running():
            raise RuntimeError(
                "Nested async detected. "
                "Use async functions where possible (`aquery`, `aretrieve`, `arun`, etc.). "
                "Otherwise, use `import nest_asyncio; nest_asyncio.apply()` "
                "to enable nested async or use in a jupyter notebook.\n\n"
                "If you are experiencing while using async functions and not in a notebook, "
                "please raise an issue on github, as it indicates a bad design pattern."
            )
        else:
            return loop.run_until_complete(coro)
    except RuntimeError:
        return asyncio.run(coro)


I edited my cache file to just use nest_asyncio.apply() here instead of throwing the error, but that's kinda counterintuitive for the long term.
Have y'all
a) seen this before?
b) found a decent workaround?

If there's a different transform that I could use to achieve the same thing without running up against that error, that'd be great (I haven't found one).
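For anyone trying to reproduce this outside of the library, here is a minimal sketch using only the standard library. It assumes the app is already running an event loop when the sync call happens (the post doesn't say so explicitly, but that is the usual cause). `extract_title`, `sync_wrapper`, and `handler` are hypothetical names for illustration, not llama_index functions.

```python
import asyncio


async def extract_title() -> str:
    # Hypothetical stand-in for an async call made inside a transformation.
    await asyncio.sleep(0)
    return "title"


def sync_wrapper() -> str:
    # Mimics a sync API that starts its own event loop internally.
    coro = extract_title()
    try:
        return asyncio.run(coro)
    except RuntimeError:
        coro.close()  # avoid a "coroutine was never awaited" warning
        raise


async def handler() -> str:
    # We are already inside a running loop here (e.g. an async request handler),
    # so the sync wrapper cannot start a second one.
    try:
        return sync_wrapper()
    except RuntimeError as exc:
        return f"failed: {exc}"


result = asyncio.run(handler())
print(result)
```

Called from plain synchronous code, `sync_wrapper()` works fine; it only fails when something higher up the call stack is already running a loop.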
5 comments
Applying nest_asyncio (`import nest_asyncio; nest_asyncio.apply()`) is pretty standard, especially if you are running in a notebook.

If you want to avoid that, I recommend using actual async entry points like await pipeline.arun(..)
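A rough sketch of that pattern, using only the standard library so it runs anywhere. `StandInPipeline` and `ingest` are hypothetical stand-ins, not the real IngestionPipeline; with llama_index the awaited line would be something like `await pipeline.arun(documents=documents)`.

```python
import asyncio
from typing import List


class StandInPipeline:
    """Hypothetical stand-in for IngestionPipeline, not the real class."""

    async def arun(self, documents: List[str]) -> List[str]:
        # Pretend each document needs one async call (e.g. title extraction).
        async def transform(doc: str) -> str:
            await asyncio.sleep(0)
            return doc.title()

        return list(await asyncio.gather(*(transform(d) for d in documents)))

    def run(self, documents: List[str]) -> List[str]:
        # Sync wrapper: starts its own loop, so it fails inside a running one.
        return asyncio.run(self.arun(documents))


async def ingest(documents: List[str]) -> List[str]:
    pipeline = StandInPipeline()
    # Inside async code, await the async entry point instead of calling run().
    return await pipeline.arun(documents)


if __name__ == "__main__":
    # One asyncio.run at the outermost entry point; everything below it awaits.
    print(asyncio.run(ingest(["first doc", "second doc"])))
```

The idea is to have a single place that owns the event loop and to keep every call beneath it async, so nothing ever tries to start a nested loop.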
I wasn't running in a notebook. We have an app hosted in Azure we're using to ingest content from different sources. I'll try .arun()! Thanks!