Ingestion
maybe goats dont exist · last year
Do y'all have any tips on improving file ingestion speed? I'm only using a node parser and embeddings, but large files are still quite slow.
WhiteFang_Jr · last year
You could try parallel ingestion:
https://docs.llamaindex.ai/en/stable/module_guides/loading/ingestion_pipeline/root.html#parallel-processing
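For reference, a minimal sketch of the pattern that page describes, assuming llama-index 0.10+ (the llama_index.core namespace; older releases import from llama_index directly) and a hypothetical ./data folder:

```python
# A sketch of parallel ingestion, not the thread author's exact setup.
from llama_index.core import SimpleDirectoryReader
from llama_index.core.ingestion import IngestionPipeline
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.openai import OpenAIEmbedding

pipeline = IngestionPipeline(
    transformations=[
        SentenceSplitter(chunk_size=1024, chunk_overlap=20),  # node parser step
        OpenAIEmbedding(),                                    # embedding step
    ]
)

if __name__ == "__main__":  # num_workers spawns worker processes, so guard the entry point
    documents = SimpleDirectoryReader("./data").load_data()
    nodes = pipeline.run(documents=documents, num_workers=4)
```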
Logan M · last year
Also, you can increase the batch size on embeddings (especially if you are using API-based embeddings like OpenAI).
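As a hedged sketch, the knob for this on llama_index embedding models is embed_batch_size (the exact default varies by version; larger values mean fewer round trips to the API):

```python
from llama_index.embeddings.openai import OpenAIEmbedding

# Each request to the embeddings API now carries up to 100 chunks
# instead of the smaller default, cutting per-request overhead.
embed_model = OpenAIEmbedding(embed_batch_size=100)
```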
maybe goats dont exist · last year
Can node_processor be parallelized?
maybe goats dont exist · last year
I'll increase that for sure. I think the node processor is the slow part right now.
Logan M · last year
It can be, using the above example.
maybe goats dont exist · last year
I don't use an ingestion pipeline as it doesn't work for some reason lol. Can I just provide num_workers?
Logan M · last year
Nope, because we can't multiprocess at that low a level (too many un-picklable errors).
Logan M · last year
I can help you set up an ingestion pipeline.
Logan M · last year
It should be fairly easy.
maybe goats dont exist · last year
Will parallelizing the ingestion pipeline help, then, if all I'm doing is node parsing and embedding and the node parser can't be parallelized?
Logan M · last year
Each step in an ingestion pipeline can be parallelized
Logan M · last year
including the node parser
Logan M · last year
(but it can't happen directly in the node parser, long story)
Logan M · last year
Try it out, it will hopefully make sense.
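Putting both suggestions together, a hedged end-to-end sketch (hypothetical paths and parameter values; the pipeline parallelizes the splitting and embedding steps across worker processes, and the pre-embedded nodes make index construction cheap):

```python
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.ingestion import IngestionPipeline
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.openai import OpenAIEmbedding

embed_model = OpenAIEmbedding(embed_batch_size=100)  # fewer API round trips

pipeline = IngestionPipeline(
    transformations=[
        SentenceSplitter(chunk_size=1024, chunk_overlap=20),
        embed_model,
    ]
)

if __name__ == "__main__":  # multiprocessing guard for the spawned workers
    documents = SimpleDirectoryReader("./data").load_data()
    nodes = pipeline.run(documents=documents, num_workers=4)
    # Nodes already carry embeddings, so no re-embedding happens here.
    index = VectorStoreIndex(nodes, embed_model=embed_model)
```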