How can I do parallel processing on IngestionPipelines?

My conversations have as many as 200 documents with as many as 800 pages, so I need to preprocess data before my customers can start a conversation.

I’ve scoured the docs/code but haven’t found a way to run multiple pipeline calls at once. I’m currently using asyncio.gather over documents and then pages, calling pipeline.arun for each page, but my results still appear to be sequential…
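
Roughly what that looks like (a simplified sketch of my current approach; preprocess_page and pages_by_doc are placeholder names):

Python
import asyncio

async def preprocess_page(pipeline, page_docs):
    # one pipeline call per page
    return await pipeline.arun(documents=page_docs)

async def preprocess_all(pipeline, pages_by_doc):
    # fan out every page of every document and await them together
    tasks = [
        preprocess_page(pipeline, page)
        for pages in pages_by_doc
        for page in pages
    ]
    return await asyncio.gather(*tasks)

Timings from a test run: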

Plain Text
Processed 6 documents in 130.94 seconds
Total number of pages processed: 6
Average time per document: 21.82 seconds
Average time per page: 21.50 seconds
Doc 4 took 16.62 seconds
  Page 1 took 14.89 seconds
Doc 2 took 39.05 seconds
  Page 1 took 38.35 seconds
Doc 6 took 38.55 seconds
  Page 1 took 37.80 seconds
Doc 5 took 93.89 seconds
  Page 1 took 85.75 seconds
Doc 1 took 129.99 seconds
  Page 1 took 128.76 seconds
Doc 3 took 130.94 seconds
  Page 1 took 129.01 seconds


If this test conversation of 6 docs / 6 pages (all small text) averaged ~20 seconds per page, then the entire job should only take ~20 seconds when the pages actually run in parallel, right? Any recs on how to make this work?
tryna make these scream
[Attachment: image.png]
Async is more about concurrency, i.e. letting several API calls go out at once.

If you aren't using API-based embeddings or LLMs, you probably won't notice any speedup with async.

If you are using API-based models, try increasing the num_workers kwarg on any metadata extractors.
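
For example (a sketch, assuming the 0.9.x import paths; the chunk size, worker count, and "data" directory are placeholders):

Python
from llama_index import SimpleDirectoryReader
from llama_index.extractors import TitleExtractor
from llama_index.ingestion import IngestionPipeline
from llama_index.node_parser import SentenceSplitter

documents = SimpleDirectoryReader("data").load_data()

pipeline = IngestionPipeline(
    transformations=[
        SentenceSplitter(chunk_size=512),
        # num_workers controls how many extraction calls go out concurrently
        TitleExtractor(num_workers=8),
    ]
)
nodes = pipeline.run(documents=documents)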
Would love to test the PR! Is it released yet?
P.S. I had num_workers=8 set for the above runs
It's merged into main, I'm about to cut a release 🙂
Ready when you are!
v0.9.29 is out 😉
Is it still deploying?

Plain Text
(llama-app-backend-py3.11) joshuasabol@Joshuas-MacBook-Pro-2 backend % poetry add llama-index@0.9.29

Could not find a matching version of package llama-index
It might be -- lemme check
hmmm got an error on publish I see
Reading https://github.com/run-llama/llama_index/pull/9920

NOTE: I didn't encounter the cannot pickle CoreBPE error. It seems that moving parallelization up to the run method has resulted in not needing to pickle lower-level imports. If we had defined tokenizer here directly like we have in SentenceSplitter, then that's when we'd see the error and need the fix. The same goes for the partial fix for lambda funcs not being pickle-able — that fix is no longer necessary here.

Does this mean SentenceSplitter is not parallelizable? I'm using it in my pipeline
No, it is parallelizable -- since the pipeline splits the jobs one level higher, it works.
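
Something like this should split the documents across worker processes (a sketch; assumes the num_workers kwarg that v0.9.29 adds to run(), with placeholder paths and settings):

Python
from llama_index import SimpleDirectoryReader
from llama_index.ingestion import IngestionPipeline
from llama_index.node_parser import SentenceSplitter

if __name__ == "__main__":
    documents = SimpleDirectoryReader("data").load_data()

    pipeline = IngestionPipeline(
        transformations=[SentenceSplitter(chunk_size=512)]
    )
    # num_workers > 1 farms batches of documents out to worker processes,
    # so SentenceSplitter (and any other transformations) run in parallel
    nodes = pipeline.run(documents=documents, num_workers=4)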
ok, now it published lol
I keep getting this error: cannot pickle '_asyncio.Task' object
Can you share a code sample that reproduces that?
Still trying to figure out why it's not working within my chat app, but I got it working via a script, and WOW this is going to be a huge time saver

TYSM @Logan M, @jerryjliu0, et al.
[Attachment: image.png]
WOW that's an increase haha amazing
We made em scream alright haha
[Attachment: image.png]
@Logan M -- jw, do any of the PDF loaders (ex. PDFReader) support parallel processing?