rdx99.

Hey,

I'm getting an error after many warnings like the one below when using IngestionPipeline with parallelization (the transformations use the hierarchical node parser):


create_hierarchical_index_qdrant Creating pipeline
WARNINGS /python3.11/site-packages/llama_index/core/schema.py:94:__getstate__ Removing unpickleable private attribute _chunking_tokenizer_fn
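
For context on the warning text: it comes from a `__getstate__` hook, which Python's pickle calls when an object is serialized, for example to hand it to a worker process. Below is a minimal standard-library sketch of that pattern. It is not llama_index's actual code; the class `Parser` and its attribute names are hypothetical.

```python
import pickle


class Parser:
    """Hypothetical stand-in for a component that holds an unpicklable callable."""

    def __init__(self):
        self.chunk_size = 2048
        # Lambdas cannot be pickled, much like some tokenizer callables.
        self._tokenizer_fn = lambda text: text.split()

    def __getstate__(self):
        # Copy the instance dict and drop the attribute pickle cannot handle.
        state = self.__dict__.copy()
        state.pop("_tokenizer_fn", None)
        return state


original = Parser()
restored = pickle.loads(pickle.dumps(original))

print(hasattr(original, "_tokenizer_fn"))  # True
print(hasattr(restored, "_tokenizer_fn"))  # False
print(restored.chunk_size)  # 2048
```

The copy that comes back from pickle no longer has the dropped attribute, so a warning of this kind on its own does not necessarily mean the run has failed.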


rdx99.


Is there an issue in IngestionPipeline when using parallel processing mode?


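import os

from llama_index.core import Settings
from llama_index.core.ingestion import IngestionPipeline
from llama_index.core.node_parser import HierarchicalNodeParser

# `logger` and `createHierarchicalIndexRequest` are defined elsewhere in my code.
transformations = [
    HierarchicalNodeParser.from_defaults(chunk_sizes=[4096, 2048]),
    Settings.embed_model,
]
logger.info("Creating pipeline")
pipeline = IngestionPipeline(transformations=transformations)
# pipeline.disable_cache = False
logger.info("Num workers: " + str(os.cpu_count()))

# Hangs here when num_workers is set; runs (slowly) without it.
nodes = pipeline.run(
    documents=createHierarchicalIndexRequest.Documents,
    num_workers=4,
)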

My pipeline neither returns any error messages nor executes anything after the pipeline.run() call. If I remove the num_workers argument it runs, but it's extremely slow. Any advice?
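
One way to narrow this down, assuming the parallel mode passes the transformations to worker processes via pickle (which the `__getstate__` warning suggests but the thread does not confirm), is to check up front which objects pickle refuses to serialize. The helper `find_unpicklable` below is a hypothetical standard-library sketch, not part of llama_index.

```python
import pickle


def find_unpicklable(objects):
    """Return (index, error message) for each object pickle cannot serialize."""
    failures = []
    for index, obj in enumerate(objects):
        try:
            pickle.dumps(obj)
        except Exception as exc:  # pickle raises several different error types
            failures.append((index, f"{type(exc).__name__}: {exc}"))
    return failures


# Stand-in values; in the snippet above this would be the `transformations` list.
sample = [{"chunk_sizes": [4096, 2048]}, lambda text: text.split()]
for index, message in find_unpicklable(sample):
    print(f"item {index} is not picklable: {message}")
```

An empty result only shows that serialization itself succeeds. An object that silently drops attributes while being pickled, as in the warning above, would still pass this check.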


I tried a few things, but it's still not working.