The community member is experiencing rate limit errors when generating OpenAI embeddings in an ingestion pipeline. The retry logic does not seem to work properly, and requests eventually fail after 6 attempts. The community member is looking for ways to avoid these rate limit errors and to track how many tokens each request sends so they can rate limit from their own app.
In the comments, other community members suggest tuning the batch size: one provides an example that sets a batch size of 50 on the OpenAIEmbedding class, and another mentions having tried batch sizes of 10, 100, and 200, though it is unclear whether that resolved the issue.
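As a minimal sketch of the batch-size suggestion, assuming the llama-index OpenAI embedding integration is installed: `embed_batch_size` controls how many texts go into each embeddings request, and `max_retries` bounds the client's built-in retry loop. The model name is an assumption; substitute your own.

```python
from llama_index.core.ingestion import IngestionPipeline
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.openai import OpenAIEmbedding

embed_model = OpenAIEmbedding(
    model="text-embedding-3-small",  # assumed model; swap in your own
    embed_batch_size=50,             # the batch size suggested in the comments
    max_retries=6,                   # matches the 6 attempts described above
)

pipeline = IngestionPipeline(
    transformations=[SentenceSplitter(), embed_model],
)

# nodes = pipeline.run(documents=documents)
```

Smaller batches spread tokens across more requests (helpful against tokens-per-minute limits), while larger batches reduce the request count (helpful against requests-per-minute limits), so which direction helps depends on which limit is being hit.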
There is no explicitly marked answer in the provided information.
How do I avoid rate limit errors when generating OpenAI embeddings in an ingestion pipeline? The retry logic doesn't seem to work properly, as it eventually just fails after 6 tries. How can I track how many tokens are actually being sent in requests so I can rate limit properly from my app?
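For the token-tracking part of the question, one common approach (not confirmed in the thread) is to count tokens client-side with tiktoken before each batch is sent, and throttle against a tokens-per-minute budget. The `TPM_BUDGET` value and the `throttle` helper below are hypothetical; the encoding name matches OpenAI's current embedding models. LlamaIndex's `TokenCountingHandler` callback is another option for observing embedding token counts after the fact.

```python
import time

import tiktoken

# cl100k_base is the encoding used by text-embedding-ada-002 and text-embedding-3-*
encoding = tiktoken.get_encoding("cl100k_base")

TPM_BUDGET = 1_000_000  # hypothetical tokens-per-minute limit; use your tier's value
tokens_this_minute = 0
window_start = time.monotonic()

def count_batch_tokens(texts: list[str]) -> int:
    """Sum the token counts of every text in a batch before sending it."""
    return sum(len(encoding.encode(text)) for text in texts)

def throttle(batch: list[str]) -> None:
    """Wait for the next minute window if this batch would exceed the budget."""
    global tokens_this_minute, window_start
    elapsed = time.monotonic() - window_start
    if elapsed >= 60:
        # New minute window: reset the running token count.
        tokens_this_minute, window_start = 0, time.monotonic()
        elapsed = 0.0
    needed = count_batch_tokens(batch)
    if tokens_this_minute + needed > TPM_BUDGET:
        # Sleep out the rest of the window, then start a fresh one.
        time.sleep(max(0.0, 60 - elapsed))
        tokens_this_minute, window_start = 0, time.monotonic()
    tokens_this_minute += needed

# Usage: call throttle(batch) right before each embeddings request.
```

Counting locally this way also makes the rate limit errors easier to diagnose, since you can log exactly how many tokens each failing request contained.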