
Updated 2 months ago

How do I get past rate limit errors?

How do I get past rate limit errors? I have a chat application that uses LlamaIndex and I often get this:
Plain Text
WARNING:llama_index.llms.openai_utils:Retrying llama_index.llms.openai_utils.completion_with_retry.<locals>._completion_with_retry in 8.0 seconds as it raised RateLimitError: Rate limit reached for gpt-3.5-turbo in organization org-rfaUzt0VkU7EjbD5nJEEz7yh on tokens per min. Limit: 90000 / min. Current: 86359 / min. Contact us through our help center at help.openai.com if you continue to have issues..

Is there a way to load balance?
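There's no built-in load balancer, but since the warning shows the limit applies per organization, one common workaround is rotating requests across API keys from separate organizations. This is only a minimal sketch, not a LlamaIndex feature, and the keys below are placeholders:

```python
from itertools import cycle

# Placeholder keys -- in practice these would come from separate
# OpenAI organizations, each with its own tokens-per-minute quota.
API_KEYS = ["sk-org-a-...", "sk-org-b-...", "sk-org-c-..."]

_key_iter = cycle(API_KEYS)

def next_api_key() -> str:
    """Return the next key in round-robin order, spreading requests
    (and thus token usage) across organizations."""
    return next(_key_iter)
```

Each request would then be made with `next_api_key()` instead of a single fixed key.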
3 comments
Is this happening when creating the index, or during the conversations?

Maybe you can try setting the token_limit sent to the AI:
Plain Text
from llama_index.memory import ChatMemoryBuffer
memory = ChatMemoryBuffer.from_defaults(token_limit=1500)
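For context, token_limit caps how much conversation history gets resent with each request, which directly reduces tokens-per-minute usage. Conceptually the buffer behaves like this toy sketch (a crude word-count stand-in, not LlamaIndex's actual implementation):

```python
class SimpleTokenMemory:
    """Toy stand-in for ChatMemoryBuffer: keeps only the newest
    messages whose combined token estimate fits under token_limit."""

    def __init__(self, token_limit: int):
        self.token_limit = token_limit
        self.messages: list[str] = []

    @staticmethod
    def _tokens(text: str) -> int:
        # Crude estimate: ~1 token per whitespace-separated word.
        return len(text.split())

    def put(self, message: str) -> None:
        self.messages.append(message)
        # Drop the oldest messages until the estimate fits the limit.
        while sum(self._tokens(m) for m in self.messages) > self.token_limit:
            self.messages.pop(0)
```

So with token_limit=1500, older turns fall out of the window and each request stays well under the per-minute quota.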

A few more details would help as well.
If you want an instant improvement, consider using gpt-3.5-turbo-16k; it should have double the rate limit. You can also apply for a rate limit increase from OpenAI: https://platform.openai.com/docs/guides/rate-limits/overview
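While waiting for an increase, client-side retry with exponential backoff also helps; the warning above shows LlamaIndex already does this internally. A minimal sketch of the idea, with RuntimeError standing in for openai's RateLimitError so the example stays self-contained:

```python
import time

def with_backoff(fn, max_retries=5, base_delay=1.0):
    """Call fn(), retrying on RuntimeError with exponentially
    growing delays (base_delay, 2x, 4x, ...)."""
    for attempt in range(max_retries):
        try:
            return fn()
        except RuntimeError:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error
            time.sleep(base_delay * 2 ** attempt)
```

In real code you would catch the actual rate-limit exception from your OpenAI client version instead of RuntimeError.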
Thanks, the chat memory seems to be working for the moment. I'll also try the 16k later