How do I avoid rate limit errors when generating OpenAI embeddings in an ingestion pipeline? The retry logic doesn’t seem to work properly: it eventually just fails after 6 tries. How can I track how many tokens are actually being sent in each request, so that I can rate limit properly from my app?
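
To make the question concrete, here is a minimal sketch of the kind of client-side tracking being asked about: count tokens per batch before sending, and block until a per-minute token budget has room. Everything named here is hypothetical and not part of any library: `estimate_tokens`, `batch_by_tokens`, `TokenRateLimiter`, and the budget numbers are placeholders. The 4-characters-per-token estimate is only a rough assumption; a real tokenizer such as `tiktoken` would give more accurate counts.

```python
import time
from collections import deque


def estimate_tokens(text):
    """Rough placeholder (assumption): about 4 characters per token.

    Replace with a real tokenizer for accurate counts.
    """
    return max(1, len(text) // 4)


def batch_by_tokens(texts, max_tokens_per_request, count=estimate_tokens):
    """Group texts into batches whose estimated token total stays under the cap.

    Yields (batch, token_count) pairs. A single text larger than the cap
    is still emitted on its own; the caller has to decide how to split it.
    """
    batch, size = [], 0
    for text in texts:
        n = count(text)
        if batch and size + n > max_tokens_per_request:
            yield batch, size
            batch, size = [], 0
        batch.append(text)
        size += n
    if batch:
        yield batch, size


class TokenRateLimiter:
    """Sliding-window limiter: at most `tokens_per_window` tokens per `window` seconds."""

    def __init__(self, tokens_per_window, window=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.limit = tokens_per_window
        self.window = window
        self.clock = clock
        self.sleep = sleep
        self.events = deque()  # (timestamp, tokens) for recent requests
        self.used = 0          # tokens currently counted inside the window

    def _expire(self, now):
        # Drop requests that have aged out of the window.
        while self.events and now - self.events[0][0] >= self.window:
            _, n = self.events.popleft()
            self.used -= n

    def acquire(self, tokens):
        """Block until `tokens` fit in the current window, then record them."""
        if tokens > self.limit:
            raise ValueError("request is larger than the whole window budget")
        while True:
            now = self.clock()
            self._expire(now)
            if self.used + tokens <= self.limit:
                self.events.append((now, tokens))
                self.used += tokens
                return
            # Sleep until the oldest recorded request leaves the window.
            wait = self.window - (now - self.events[0][0])
            self.sleep(max(wait, 0.0))


if __name__ == "__main__":
    # Placeholder budget; real limits depend on the account and model.
    limiter = TokenRateLimiter(tokens_per_window=1000)
    docs = ["some chunk of text " * 20] * 5
    for batch, n_tokens in batch_by_tokens(docs, max_tokens_per_request=300):
        limiter.acquire(n_tokens)
        # The embeddings request for `batch` would be sent here.
        print(len(batch), "texts,", n_tokens, "estimated tokens")
```

A limiter like this only reduces how often the API's limit is hit; it does not replace retries with backoff. The estimate can also be checked against what the API reports: the embeddings response includes a `usage` field with the token count for each request, which could be logged next to the local figure.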