Updated 3 months ago

I'm trying to run the following code, and it's only using 1 of 12 CPU cores while reading the PDF files from this directory. Is there a way to have SimpleDirectoryReader use multiprocessing or something to read in and parse multiple files at once?

from llama_index.core import SimpleDirectoryReader

reader = SimpleDirectoryReader(
    input_dir="/home/ovo/code/datasets/ebooks/compsci/"
)
docs = reader.load_data()
print(f"Loaded {len(docs)} docs")
2 comments
Pass num_workers to load_data() so the files are parsed in parallel:

docs = reader.load_data(num_workers=10)
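
If you'd rather not hard-code 10, here is a rough sketch of picking the worker count from the machine and the number of files. The helper names pick_num_workers and load_pdfs are made up for illustration (they are not part of llama_index), and leaving 2 cores free is just an assumption; only SimpleDirectoryReader and load_data(num_workers=...) come from the thread above.

```python
import os
from pathlib import Path
from typing import Optional


def pick_num_workers(n_files: int, reserve: int = 2,
                     cpu_count: Optional[int] = None) -> int:
    """Hypothetical helper: choose a worker count, leaving a few cores free."""
    cores = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    # Never more workers than files, and never fewer than one.
    return max(1, min(n_files, cores - reserve))


def load_pdfs(input_dir: str):
    """Hypothetical wrapper around the reader call from the thread."""
    # Imported here so pick_num_workers works without llama_index installed.
    from llama_index.core import SimpleDirectoryReader

    # Rough count of top-level files (the reader is not recursive by default).
    n_files = sum(1 for p in Path(input_dir).iterdir() if p.is_file())
    reader = SimpleDirectoryReader(input_dir=input_dir)
    return reader.load_data(num_workers=pick_num_workers(n_files))
```

When running this as a script, it is probably safest to call load_pdfs(...) under an `if __name__ == "__main__":` guard, since multiprocessing may re-import the module in each worker process.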