How do I avoid rate limit errors when generating OpenAI embeddings?

How do I avoid rate limit errors when generating OpenAI embeddings in an ingestion pipeline? The retry logic doesn't seem to work properly; it eventually just fails after 6 tries. How can I track how many tokens are actually being sent in each request so I can rate limit properly from my app?
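
For the token-tracking half of the question, one option (not suggested in the thread itself) is to count tokens locally with tiktoken before each request, so your app knows how much of the tokens-per-minute budget a batch will use. A minimal sketch, assuming the embedding model uses the cl100k_base encoding:

Python
import tiktoken

# cl100k_base is the encoding used by text-embedding-ada-002 and the
# text-embedding-3-* models.
enc = tiktoken.get_encoding("cl100k_base")

def batch_token_count(texts: list[str]) -> int:
    # Total tokens this batch will consume against the tokens-per-minute limit.
    return sum(len(enc.encode(t)) for t in texts)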
You could try increasing the batch size; I think the default is 10:
Python
from llama_index.embeddings.openai import OpenAIEmbedding

# A larger batch means fewer requests against the requests-per-minute limit.
embed_model = OpenAIEmbedding(embed_batch_size=50)  # plus any other kwargs you already pass
Have tried batch sizes of 10, 100, and 200.
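
If larger batches alone don't fix it, you can also wrap the batch call in your own backoff loop rather than relying only on the built-in retries. A minimal sketch, assuming openai.RateLimitError is the exception your installed client raises on 429 responses:

Python
import time

import openai
from llama_index.embeddings.openai import OpenAIEmbedding

embed_model = OpenAIEmbedding(embed_batch_size=50)

def embed_with_backoff(texts: list[str], max_attempts: int = 8) -> list[list[float]]:
    # Retry with exponential backoff: wait 1s, 2s, 4s, ... between attempts.
    for attempt in range(max_attempts):
        try:
            return embed_model.get_text_embedding_batch(texts)
        except openai.RateLimitError:
            time.sleep(2 ** attempt)
    raise RuntimeError(f"Still rate limited after {max_attempts} attempts")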