
Updated 6 months ago


At a glance

The community members are discussing how to reduce the number of requests made to the OpenAI API. One community member suggests increasing the embedding batch size, as the default is 10. Another community member provides code examples that set the batch size to 100 and update the service context accordingly. However, the original poster notes that the change is not working as expected. The community members continue to discuss potential solutions, but there is no explicitly marked answer.

Useful resources
How do I reduce the number of requests made to the OpenAI API?
7 comments
Are you hitting rate limits?

Probably for embeddings. You could try lowering the embedding batch size (default is 10)

https://gpt-index.readthedocs.io/en/latest/core_modules/model_modules/embeddings/usage_pattern.html#batch-size
You mean increase the batch size, no? I need to reduce the number of requests made to OpenAI.
Right whoops, was confusing token limits with rate limits lol
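As a rough sketch of why the batch size matters here: each batch of text chunks is sent as one embedding API call, so the number of requests is roughly the chunk count divided by `embed_batch_size`, rounded up. The chunk count below (1,000) is a made-up number for illustration:

```python
import math

def embedding_requests(num_chunks: int, embed_batch_size: int) -> int:
    """Approximate number of embedding API requests:
    each batch of chunks is one call, so round up."""
    return math.ceil(num_chunks / embed_batch_size)

# Hypothetical corpus of 1,000 chunks:
print(embedding_requests(1000, 10))   # default batch size 10  -> 100 requests
print(embedding_requests(1000, 100))  # embed_batch_size=100   -> 10 requests
```

So raising the batch size from the default 10 to 100 cuts the embedding request count by roughly 10x, which is why increasing (not lowering) it helps with per-request rate limits.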
from llama_index import ServiceContext
from llama_index.embeddings import OpenAIEmbedding

embed_model = OpenAIEmbedding()
service_context = ServiceContext.from_defaults(embed_model=embed_model)

embed_model = OpenAIEmbedding(embed_batch_size=100)
Do I just need to add this to my code, or am I forgetting anything? Because it's not working
you'll have to set it in the service context after changing the batch size
probably easier to set a global context too

from llama_index import ServiceContext, set_global_service_context
from llama_index.embeddings import OpenAIEmbedding

embed_model = OpenAIEmbedding(embed_batch_size=100)
service_context = ServiceContext.from_defaults(embed_model=embed_model)

set_global_service_context(service_context)