
Updated 4 months ago

I am building an index with 300k


Bar Haim

I am building an index with 300k documents. Is it possible to see the progress of the build, like a tqdm progress bar?

16 comments

Bar Haim

Also, any ideas for optimizing the running time? I use model_name=BAAI/bge-small-en. Can it run on GPU?

Emanuel Ferreira

index = VectorStoreIndex.from_documents(documents, show_progress=True)

Emanuel Ferreira

About optimization:

https://gpt-index.readthedocs.io/en/latest/examples/llm/llama_2_llama_cpp.html#installation

Bar Haim

Thanks

Emanuel Ferreira

Oops

Emanuel Ferreira

https://gpt-index.readthedocs.io/en/stable/getting_started/installation.html#local-environment-setup

Emanuel Ferreira

this one as well

Bar Haim

Attachment

Bar Haim

I can't find anything in the docs on how to improve the embedding speed.

Bar Haim

I am using the local embedding model. This is the message I get: "Could not load OpenAIEmbedding. Using HuggingFaceBgeEmbeddings with model_name=BAAI/bge-small-en. If you intended to use OpenAI, please check your OPENAI_API_KEY."

Bar Haim

I mean, how can I run BAAI/bge-small-en faster? What if I add a GPU to my machine?

Emanuel Ferreira

Maybe @Logan M can jump in here.

You can try better hardware, but the embedding itself doesn't offer many other ways to improve speed, especially this LangChain one. With OpenAIEmbeddings you could increase the batch size.
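To make the batch-size point concrete, here is a minimal plain-Python sketch, not LlamaIndex or LangChain code. It assumes a hypothetical `embed_batch` callable that embeds a list of texts in one model call; `batched`, `embed_all`, and `fake_embed_batch` are illustrative names only. A larger batch size means fewer calls for the same number of texts, and the loop also shows one simple way to report progress per batch.

```python
import math
from typing import Callable, Iterator, List, Sequence


def batched(items: Sequence[str], batch_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_all(
    texts: Sequence[str],
    embed_batch: Callable[[Sequence[str]], List[List[float]]],
    batch_size: int = 10,
) -> List[List[float]]:
    """Embed `texts` batch by batch, printing coarse progress after each batch."""
    total_batches = math.ceil(len(texts) / batch_size)
    vectors: List[List[float]] = []
    for i, batch in enumerate(batched(texts, batch_size), start=1):
        vectors.extend(embed_batch(batch))  # one model call per batch
        print(f"batch {i}/{total_batches} done ({len(vectors)}/{len(texts)} texts)")
    return vectors


def fake_embed_batch(batch: Sequence[str]) -> List[List[float]]:
    # Hypothetical stand-in for a real model: one 1-dimensional vector per text.
    return [[float(len(text))] for text in batch]


texts = [f"doc {n}" for n in range(25)]
vectors = embed_all(texts, fake_embed_batch, batch_size=10)
```

With 25 texts and a batch size of 10, the stand-in model is called three times (10, 10, and 5 texts). With a real model, the best batch size depends on available memory, so it is worth measuring rather than guessing.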

Logan M

GPU will definitely help here

Logan M

I think if you have CUDA installed, it will automatically run on the GPU too.
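A quick way to check whether a CUDA GPU would be picked up, as a minimal sketch: it assumes the embedding model runs on PyTorch, and `pick_device` is a hypothetical helper name, not part of any library.

```python
def pick_device() -> str:
    """Return "cuda" if PyTorch can see a CUDA GPU, otherwise "cpu"."""
    try:
        import torch  # assumption: the local embedding model is PyTorch-based
    except ImportError:
        # Without PyTorch installed there is nothing to query, so report CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"embedding model would run on: {device}")
```

If this prints "cpu" on a machine that has a GPU, the usual suspects are a CPU-only PyTorch build or a driver mismatch. The result can also be passed explicitly where the library allows it, for example `SentenceTransformer("BAAI/bge-small-en", device=device)` in sentence-transformers.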

Bar Haim

Oh nice, I'll try it on a CUDA GPU.

Emanuel Ferreira

Yo @Bar Haim

This might interest you as well:

https://x.com/jerryjliu0/status/1706097203486629889?s=20
