Indexing/Embedding question. How can I speed up the embedding/indexing process of llama index?

edk
last year
Indexing/Embedding question. How can I speed up the embedding/indexing process of llama index? I have possibly thousands of documents I want to index as fast as possible. Any help is appreciated.
6 comments
Teemu
last year
Are you using the OpenAI API? You can increase the batch size:
https://gpt-index.readthedocs.io/en/latest/module_guides/models/embeddings.html
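For context, the batch size on that docs page is the embedding model's embed_batch_size parameter (e.g. OpenAIEmbedding(embed_batch_size=...) in LlamaIndex, with a default of 10 if memory serves). The effect of a larger batch is simply fewer round trips to the API; here is a minimal pure-Python sketch of the batching idea (an illustration, not LlamaIndex internals):

```python
def chunked(items, batch_size):
    # Yield successive batches of at most batch_size items each.
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

docs = [f"document {n}" for n in range(1000)]

# With a batch size of 10, 1000 documents means 100 embedding requests;
# raising it to 100 cuts that to 10 round trips.
small = list(chunked(docs, batch_size=10))
large = list(chunked(docs, batch_size=100))
```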
edk
last year
@Teemu much appreciated. It really helped. What's the largest/fastest batch size I can request, if you happen to know?
Teemu
last year
It will depend on your personal rate limits; I probably have different ones, so it's hard to say. You can try higher ones and see what happens.
Teemu
last year
Also, if I'm not mistaken, they recently upped the rate limits (at least mine). Not sure if the standard batch size reflects that change yet.
edk
last year
Many thanks. By the way, do you know if I can multiprocess/distribute indexing of documents, or is it inherently a synchronous process?
Logan M
last year
I think the max batch size for OpenAI is 2048
You should be able to set use_async=True in the index constructor to also help speed up requests.
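To connect those two points: use_async lets the indexer fire its batched embedding requests concurrently instead of one after another. A rough asyncio sketch of that idea, using a stand-in embed_batch coroutine (hypothetical, not the real client call or LlamaIndex's implementation):

```python
import asyncio

async def embed_batch(batch):
    # Stand-in for a real embedding API call (hypothetical);
    # a real version would await an HTTP request to the embedding endpoint.
    await asyncio.sleep(0.01)  # simulate network latency
    return [[float(len(text))] for text in batch]

async def embed_all(texts, batch_size=2048):
    # Split into batches (2048 being the OpenAI per-request cap mentioned above).
    batches = [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]
    # Issue all batch requests concurrently rather than sequentially --
    # this overlap is the kind of speedup use_async aims for.
    results = await asyncio.gather(*(embed_batch(b) for b in batches))
    # Flatten per-batch results back into one list of vectors.
    return [vec for batch_result in results for vec in batch_result]

embeddings = asyncio.run(embed_all(["doc one", "doc two", "doc three"], batch_size=2))
```

With real network calls, the total wall time approaches that of the slowest single batch rather than the sum of all batches.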