Ingestion
maybe goats dont exist
10 months ago
Do y'all have any tips on improving file ingestion speed? I'm only using a node parser and embeddings, but large files are still quite slow.
WhiteFang_Jr
10 months ago
You could try parallel ingestion:
https://docs.llamaindex.ai/en/stable/module_guides/loading/ingestion_pipeline/root.html#parallel-processing
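[Editor's note: a minimal sketch of the parallel-ingestion pattern that docs page describes. Import paths assume a recent llama-index release, and the "./data" path, chunk size, and worker count are illustrative, not recommendations.]

```python
from llama_index.core import SimpleDirectoryReader
from llama_index.core.ingestion import IngestionPipeline
from llama_index.core.node_parser import SentenceSplitter

# Load the raw files to ingest (path is illustrative)
documents = SimpleDirectoryReader("./data").load_data()

# The same node parser you would call directly, wrapped as a pipeline step
pipeline = IngestionPipeline(
    transformations=[SentenceSplitter(chunk_size=1024)],
)

# num_workers > 1 spreads the document batches across worker processes
nodes = pipeline.run(documents=documents, num_workers=4)
```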
Logan M
10 months ago
Also, you can increase the batch size on the embeddings (especially if you are using API-based embeddings like OpenAI).
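[Editor's note: for the batch-size tip, the OpenAI embedding class takes an `embed_batch_size` constructor argument; a short sketch, where 100 is an illustrative value rather than a tuned recommendation.]

```python
from llama_index.embeddings.openai import OpenAIEmbedding

# Larger batches mean fewer round trips to the embeddings API.
# Check your version's default and the provider's per-request limits.
embed_model = OpenAIEmbedding(embed_batch_size=100)
```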
maybe goats dont exist
10 months ago
Can node_processor be parallelized?
maybe goats dont exist
10 months ago
I'll increase that for sure. I think the node parser is the slow thing right now.
Logan M
10 months ago
It can be, using the above example.
maybe goats dont exist
10 months ago
I don't use an ingestion pipeline as it doesn't work for some reason lol. Can I just provide num_workers?
Logan M
10 months ago
Nope, because we can't multiprocess at that low a level (too many un-picklable errors).
Logan M
10 months ago
I can help you set up an ingestion pipeline.
Logan M
10 months ago
It should be fairly easy.
maybe goats dont exist
10 months ago
Will parallelizing the ingestion pipeline help then, if all I'm doing is node parsing and embedding, and the node parser can't be parallelized?
Logan M
10 months ago
Each step in an ingestion pipeline can be parallelized
Logan M
10 months ago
including the node parser
Logan M
10 months ago
(but it can't happen directly in the node parser, long story)
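[Editor's note: putting the thread's two suggestions together for the setup described above (node parser + embeddings only), a hedged sketch. Both steps run as pipeline transformations, so both are covered by num_workers; the parameter values remain illustrative.]

```python
from llama_index.core import SimpleDirectoryReader
from llama_index.core.ingestion import IngestionPipeline
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.openai import OpenAIEmbedding

documents = SimpleDirectoryReader("./data").load_data()

pipeline = IngestionPipeline(
    transformations=[
        SentenceSplitter(chunk_size=1024),       # node parsing step
        OpenAIEmbedding(embed_batch_size=100),   # embedding step
    ],
)

# The pipeline, not the parser itself, fans the batches out to workers
nodes = pipeline.run(documents=documents, num_workers=4)
```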
Logan M
10 months ago
Try it out, it will make sense hopefully.