Updated 2 months ago

Ingestion

Do y'all have any tips on improving file ingestion speed? I'm only using a node parser and embeddings, but large files are still quite slow.
14 comments
Also, you can increase the batch size on embeddings (especially if you are using API-based embeddings like OpenAI).
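To illustrate why a larger embedding batch size helps with API-based embeddings, here is a minimal, library-agnostic sketch. `embed_in_batches` and `FakeEmbedder` are hypothetical names invented for this example; the fake embedder only counts round trips and is not a real client. In LlamaIndex itself the corresponding setting is, as far as I know, the `embed_batch_size` argument on the embedding model, but check that against the version you have installed.

```python
from typing import Callable, List


def embed_in_batches(
    texts: List[str],
    embed_batch: Callable[[List[str]], List[List[float]]],
    batch_size: int,
) -> List[List[float]]:
    """Embed texts by sending up to batch_size of them per request."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    vectors: List[List[float]] = []
    for start in range(0, len(texts), batch_size):
        vectors.extend(embed_batch(texts[start:start + batch_size]))
    return vectors


class FakeEmbedder:
    """Hypothetical stand-in for an API client; it only counts round trips."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, batch: List[str]) -> List[List[float]]:
        self.calls += 1
        # One dummy one-dimensional vector per input text.
        return [[float(len(text))] for text in batch]


chunks = [f"chunk {i}" for i in range(250)]

small = FakeEmbedder()
embed_in_batches(chunks, small, batch_size=10)

large = FakeEmbedder()
embed_in_batches(chunks, large, batch_size=100)

print(small.calls, large.calls)  # prints "25 3"
```

With 250 chunks, going from 10 to 100 texts per request cuts the round trips from 25 to 3. Network latency per request is usually what dominates with hosted embeddings, though providers cap how much can be sent in one request, so the useful upper bound depends on the API.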
Can node_processor be parallelized?
I'll increase that for sure. I think the node processor is the slow thing right now.
It can be, using the ingestion pipeline example above 👍
I don't use an ingestion pipeline as it doesn't work for some reason lol. Can I just provide num_workers?
Nope, because we can't multiprocess at that low a level (too many un-picklable errors).
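For anyone unsure what an "un-picklable" error looks like: Python's multiprocessing has to pickle whatever it sends to a worker process, and objects that hold live resources can't be pickled. The sketch below is only an illustration of that general failure mode. `SplitterWithClient` is a hypothetical class, and the thread does not say which objects actually cause the problem inside the library.

```python
import pickle
import threading


def is_picklable(obj: object) -> bool:
    """Return True if obj can be serialized for another process."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, TypeError, AttributeError):
        return False


class SplitterWithClient:
    """Hypothetical parser that holds a live resource (a lock here)."""

    def __init__(self, chunk_size: int) -> None:
        self.chunk_size = chunk_size
        self._lock = threading.Lock()  # locks cannot be pickled


# Plain configuration data crosses process boundaries without trouble.
print(is_picklable({"chunk_size": 512}))

# An object carrying a lock (or a socket, client session, etc.) does not.
print(is_picklable(SplitterWithClient(512)))
```

The first check succeeds and the second fails, which is the kind of error you hit when trying to hand an arbitrary component to a process pool.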
I can help you setup an ingestion pipeline
it should be fairly easy
Will parallelizing the ingestion pipeline help then, if all I'm doing is node parsing and embedding and the node parser can't be parallelized?
Each step in an ingestion pipeline can be parallelized
including the node parser
(but it can't happen directly in the node parser, long story)
Try it out, it will make sense hopefully 😅
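A rough sketch of the idea, assuming the parallelism works by splitting the documents into batches and running every step on each batch in a separate process. This uses only the standard library; every function name here is hypothetical and the parser and embedder are trivial stand-ins, so treat it as an illustration rather than how the library implements it. With a real ingestion pipeline, the equivalent should be passing `num_workers` when you run the pipeline.

```python
from concurrent.futures import ProcessPoolExecutor
from typing import List, Optional


def split_into_chunks(text: str, chunk_size: int = 200) -> List[str]:
    """Stand-in node parser: fixed-width character chunks."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def embed_chunk(chunk: str) -> List[float]:
    """Stand-in embedding; a real one would call a model or an API."""
    return [float(len(chunk))]


def process_batch(docs: List[str]) -> List[List[float]]:
    """Run every pipeline step (parse, then embed) on one batch of documents."""
    chunks = [c for doc in docs for c in split_into_chunks(doc)]
    return [embed_chunk(c) for c in chunks]


def make_batches(docs: List[str], num_batches: int) -> List[List[str]]:
    """Split docs into at most num_batches contiguous batches, keeping order."""
    size = max(1, -(-len(docs) // num_batches))  # ceiling division
    return [docs[i:i + size] for i in range(0, len(docs), size)]


def run_pipeline(
    docs: List[str], num_workers: Optional[int] = None
) -> List[List[float]]:
    """Run sequentially, or fan batches out to worker processes."""
    if not num_workers or num_workers <= 1:
        return process_batch(docs)
    batches = make_batches(docs, num_workers)
    with ProcessPoolExecutor(max_workers=num_workers) as pool:
        # map() yields results in batch order, so output order is preserved.
        results = pool.map(process_batch, batches)
        return [vec for batch_result in results for vec in batch_result]


if __name__ == "__main__":
    documents = ["lorem ipsum " * 500 for _ in range(8)]
    sequential = run_pipeline(documents)
    parallel = run_pipeline(documents, num_workers=4)
    print(parallel == sequential)
```

The parser never has to know about processes. Only the top-level `process_batch` function and plain lists of strings cross the process boundary, which is why the parallelism lives at the pipeline level and not inside the node parser.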