
Updated 2 months ago

How do I get past rate limit errors?

How do I get past rate limit errors? I have a chat application that uses LlamaIndex and I often get this:
Plain Text
WARNING:llama_index.llms.openai_utils:Retrying llama_index.llms.openai_utils.completion_with_retry.<locals>._completion_with_retry in 8.0 seconds as it raised RateLimitError: Rate limit reached for gpt-3.5-turbo in organization org-rfaUzt0VkU7EjbD5nJEEz7yh on tokens per min. Limit: 90000 / min. Current: 86359 / min. Contact us through our help center at help.openai.com if you continue to have issues..

Is there a way to load balance?
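There's no built-in load balancer, but since the warning shows the limit applies per organization, one common workaround is rotating requests across API keys from separate organizations. This is only a minimal sketch, not a LlamaIndex feature, and the keys below are placeholders:

```python
from itertools import cycle

# Placeholder keys -- in practice these would come from separate
# OpenAI organizations, each with its own tokens-per-minute quota.
API_KEYS = ["sk-org-a-...", "sk-org-b-...", "sk-org-c-..."]

_key_iter = cycle(API_KEYS)

def next_api_key() -> str:
    """Return the next key in round-robin order, spreading requests
    (and thus token usage) across organizations."""
    return next(_key_iter)
```

Each request would then be made with `next_api_key()` instead of a single fixed key.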
3 comments
Is this happening when creating the index, or during the conversations?

Maybe you can try setting the token_limit sent to the AI:
Plain Text
from llama_index.memory import ChatMemoryBuffer
memory = ChatMemoryBuffer.from_defaults(token_limit=1500)
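For context, token_limit caps how much conversation history gets resent with each request, which directly reduces tokens-per-minute usage. Conceptually the buffer behaves like this toy sketch (a crude word-count stand-in, not LlamaIndex's actual implementation):

```python
class SimpleTokenMemory:
    """Toy stand-in for ChatMemoryBuffer: keeps only the newest
    messages whose combined token estimate fits under token_limit."""

    def __init__(self, token_limit: int):
        self.token_limit = token_limit
        self.messages: list[str] = []

    @staticmethod
    def _tokens(text: str) -> int:
        # Crude estimate: ~1 token per whitespace-separated word.
        return len(text.split())

    def put(self, message: str) -> None:
        self.messages.append(message)
        # Drop the oldest messages until the estimate fits the limit.
        while sum(self._tokens(m) for m in self.messages) > self.token_limit:
            self.messages.pop(0)
```

So with token_limit=1500, older turns fall out of the window and each request stays well under the per-minute quota.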

A few more details would help as well.
If you want an instant improvement, consider using gpt-3.5-turbo-16k; it should have double the rate limit. You can also apply for a rate limit increase from OpenAI: https://platform.openai.com/docs/guides/rate-limits/overview
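While waiting for an increase, client-side retry with exponential backoff also helps; the warning above shows LlamaIndex already does this internally. A minimal sketch of the idea, with RuntimeError standing in for openai's RateLimitError so the example stays self-contained:

```python
import time

def with_backoff(fn, max_retries=5, base_delay=1.0):
    """Call fn(), retrying on RuntimeError with exponentially
    growing delays (base_delay, 2x, 4x, ...)."""
    for attempt in range(max_retries):
        try:
            return fn()
        except RuntimeError:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error
            time.sleep(base_delay * 2 ** attempt)
```

In real code you would catch the actual rate-limit exception from your OpenAI client version instead of RuntimeError.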
Thanks, the chat memory seems to be working for the moment. I'll also try the 16k later