A community member wants to use the OpenAI batch API to get embeddings in bulk and asks whether those embeddings can be passed directly to llama-index. The replies note that llama-index has no built-in integration with batch mode, since it is built mostly around real-time applications, though batching could be useful when building an index over many documents. Because the batch API can take up to 24 hours to complete, the member would need some way to schedule the embedding jobs and then use the results for ingestion. The suggested approach is to create the batch with the OpenAI client directly and attach the returned embeddings to the nodes, though the member is unsure how to do that.
Hey everyone. I'm looking to use the openai batch api (half price) to get embeddings in bulk. Is it feasible to pass embeddings directly to llamaindex, and would anyone be able to point me in a good direction for this?
The batch API can take up to 24 hours to run, so you'd need some way to schedule the embeddings and then use those embeddings for ingestion.
You can use the openai client itself to do this though, and just attach the embeddings to your nodes and insert them once the embeddings are available.
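For reference, a minimal sketch of the submission side with the openai Python client; the texts, the `embedding_requests.jsonl` file name, the model, and the `node-{i}` custom ids are placeholders, not anything llama-index prescribes:

```python
import json
import time

from openai import OpenAI

client = OpenAI()

texts = ["first document chunk", "second document chunk"]  # your node texts

# 1. Write one /v1/embeddings request per line to a JSONL file.
with open("embedding_requests.jsonl", "w") as f:
    for i, text in enumerate(texts):
        f.write(json.dumps({
            "custom_id": f"node-{i}",
            "method": "POST",
            "url": "/v1/embeddings",
            "body": {"model": "text-embedding-3-small", "input": text},
        }) + "\n")

# 2. Upload the file and create the batch (24h completion window).
batch_file = client.files.create(
    file=open("embedding_requests.jsonl", "rb"), purpose="batch"
)
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/embeddings",
    completion_window="24h",
)

# 3. Poll until the batch reaches a terminal status, then parse the output.
while (batch := client.batches.retrieve(batch.id)).status not in (
    "completed", "failed", "expired", "cancelled"
):
    time.sleep(60)

if batch.status != "completed":
    raise RuntimeError(f"batch ended with status {batch.status}")

# Map each custom_id back to its embedding vector.
embeddings = {}
for line in client.files.content(batch.output_file_id).text.splitlines():
    result = json.loads(line)
    embeddings[result["custom_id"]] = result["response"]["body"]["data"][0]["embedding"]
```

In practice you'd run the polling step as a scheduled job rather than a blocking loop, given the potential 24-hour turnaround.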
thanks. yeah, the second piece of what you mentioned is what i was interested in doing: creating the batch through the openai client. but i wasn't sure how to attach the embeddings to nodes and insert those (as opposed to inserting text).
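A sketch of that last step, assuming the `texts` list and `embeddings` dict from the batch sketch above: `TextNode` accepts an `embedding` argument, and nodes that already carry an embedding are not re-embedded on insert.

```python
from llama_index.core import VectorStoreIndex
from llama_index.core.schema import TextNode

# Build nodes with their embeddings already set, so llama-index
# skips calling the embedding model for them at insert time.
nodes = [
    TextNode(text=text, embedding=embeddings[f"node-{i}"])
    for i, text in enumerate(texts)
]

# Nodes with a non-None .embedding are inserted as-is.
index = VectorStoreIndex(nodes=nodes)
```

Note that queries against the index still embed the query string in real time, so an embed model is still needed at query time; the batch API only covers the bulk ingestion side.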