Updated 3 months ago

Reduce

How do I reduce the number of requests made to the OpenAI API?
7 comments
Are you hitting rate limits?

Probably for embeddings. You could try lowering the embedding batch size (default is 10)

https://gpt-index.readthedocs.io/en/latest/core_modules/model_modules/embeddings/usage_pattern.html#batch-size
You mean increase the batch size, no? I need to reduce the number of requests made to OpenAI.
Right whoops, was confusing token limits with rate limits lol
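To see why a bigger batch means fewer requests: each embedding call sends up to embed_batch_size texts, so the request count is roughly the ceiling of texts divided by batch size. A back-of-the-envelope sketch in plain Python (the helper name and the chunk count of 500 are illustrative, not from LlamaIndex):

```python
import math

def num_embedding_requests(num_texts: int, embed_batch_size: int) -> int:
    """Each API call embeds up to embed_batch_size texts, so the
    number of requests is ceil(num_texts / embed_batch_size)."""
    return math.ceil(num_texts / embed_batch_size)

# 500 chunks at the default batch size of 10 -> 50 requests
print(num_embedding_requests(500, 10))   # 50
# raising embed_batch_size to 100 cuts that to 5 requests
print(num_embedding_requests(500, 100))  # 5
```

So going from the default of 10 to 100 cuts embedding requests by roughly 10x for the same documents.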
from llama_index import ServiceContext
from llama_index.embeddings import OpenAIEmbedding

# larger batch size = fewer embedding requests (default is 10)
embed_model = OpenAIEmbedding(embed_batch_size=100)
service_context = ServiceContext.from_defaults(embed_model=embed_model)
Do I just need to add this to my code, or am I forgetting anything? Because it's not working.
you'll have to set it in the service context after changing the batch size
probably easier to set a global context too

from llama_index import ServiceContext, set_global_service_context
from llama_index.embeddings import OpenAIEmbedding

# build a service context around the batched embed model,
# then set it globally so every index picks it up
embed_model = OpenAIEmbedding(embed_batch_size=100)
service_context = ServiceContext.from_defaults(embed_model=embed_model)

set_global_service_context(service_context)