Find answers from the community

Updated 4 months ago

At a glance

The community member is asking whether the process of indexing documents into Chroma DB can be parallelized, since the current process uses only one GPU. A comment suggests setting the environment variable CUDA_VISIBLE_DEVICES to expose all available GPUs, and links to the LlamaIndex documentation on parallel ingestion, which may help parallelize the ingestion step.

Is there a way to make the process of indexing documents parallel?
For example, here the chroma_db creation process is using only one GPU for me. I was wondering if there is an option to make it use both of my GPUs?

The process just loads documents that I have already parsed, indexes them using HuggingFace embeddings, and saves the result to a ChromaDB.
Attachment: Screenshot_from_2024-07-10_10-03-04.png
1 comment
Maybe adding this can help it use all the GPUs:

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1,2,3"

The numbers are the indices of the GPUs you want to make visible.
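As a minimal sketch of the suggestion above (the indices "0,1" are an assumption for a two-GPU machine; adjust to your setup), note that the variable has to be set before any library initializes CUDA:

```python
import os

# Must be set BEFORE importing torch (or anything else that initializes
# CUDA), otherwise the restriction is silently ignored.
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"  # expose GPUs 0 and 1

# With torch installed you could then verify it (not executed here):
# import torch
# torch.cuda.device_count()  # should report the two exposed GPUs
```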

Also, you can follow this to set up parallel ingestion: https://docs.llamaindex.ai/en/stable/module_guides/loading/ingestion_pipeline/?h=#parallel-processing