The community member is experiencing rate limit errors when generating OpenAI embeddings in an ingestion pipeline. The retry logic does not seem to work properly, and requests eventually fail after 6 attempts. The community member is looking for ways to avoid these rate limit errors and to track how many tokens each request sends so they can rate limit from their own app.
In the comments, other community members suggest tuning the batch size: one provides an example that sets a batch size of 50 on the OpenAIEmbedding class, and another mentions having tried batch sizes of 10, 100, and 200, though it is unclear whether that resolved the issue.
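As a minimal sketch of the batch-size suggestion, assuming the llama-index OpenAI embedding integration is installed: `embed_batch_size` controls how many texts go into each embeddings request, and `max_retries` bounds the client's built-in retry loop. The model name is an assumption; substitute your own.

```python
from llama_index.core.ingestion import IngestionPipeline
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.openai import OpenAIEmbedding

embed_model = OpenAIEmbedding(
    model="text-embedding-3-small",  # assumed model; swap in your own
    embed_batch_size=50,             # the batch size suggested in the comments
    max_retries=6,                   # matches the 6 attempts described above
)

pipeline = IngestionPipeline(
    transformations=[SentenceSplitter(), embed_model],
)

# nodes = pipeline.run(documents=documents)
```

Smaller batches spread tokens across more requests (helpful against tokens-per-minute limits), while larger batches reduce the request count (helpful against requests-per-minute limits), so which direction helps depends on which limit is being hit.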
There is no explicitly marked answer in the provided information.
How do I avoid rate limit errors when generating OpenAI embeddings in an ingestion pipeline? The retry logic doesn't seem to work properly, as it eventually just fails after 6 tries. How can I track how many tokens are actually being sent in requests so I can rate limit properly from my app?
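For the token-tracking part of the question, one common approach (not confirmed in the thread) is to count tokens client-side with tiktoken before each batch is sent, and throttle against a tokens-per-minute budget. The `TPM_BUDGET` value and the `throttle` helper below are hypothetical; the encoding name matches OpenAI's current embedding models. LlamaIndex's `TokenCountingHandler` callback is another option for observing embedding token counts after the fact.

```python
import time

import tiktoken

# cl100k_base is the encoding used by text-embedding-ada-002 and text-embedding-3-*
encoding = tiktoken.get_encoding("cl100k_base")

TPM_BUDGET = 1_000_000  # hypothetical tokens-per-minute limit; use your tier's value
tokens_this_minute = 0
window_start = time.monotonic()

def count_batch_tokens(texts: list[str]) -> int:
    """Sum the token counts of every text in a batch before sending it."""
    return sum(len(encoding.encode(text)) for text in texts)

def throttle(batch: list[str]) -> None:
    """Wait for the next minute window if this batch would exceed the budget."""
    global tokens_this_minute, window_start
    elapsed = time.monotonic() - window_start
    if elapsed >= 60:
        # New minute window: reset the running token count.
        tokens_this_minute, window_start = 0, time.monotonic()
        elapsed = 0.0
    needed = count_batch_tokens(batch)
    if tokens_this_minute + needed > TPM_BUDGET:
        # Sleep out the rest of the window, then start a fresh one.
        time.sleep(max(0.0, 60 - elapsed))
        tokens_this_minute, window_start = 0, time.monotonic()
    tokens_this_minute += needed

# Usage: call throttle(batch) right before each embeddings request.
```

Counting locally this way also makes the rate limit errors easier to diagnose, since you can log exactly how many tokens each failing request contained.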