A community member wants to use the OpenAI batch API to get embeddings in bulk and asks whether those embeddings can be passed directly to llama-index. The replies note that llama-index has no built-in integration with batch mode, since it is built mostly around real-time applications, though batching could be useful when building an index over many documents. Because the batch API can take up to 24 hours to complete, the member would need some way to schedule the embedding jobs and then use the results for ingestion. The suggested approach is to create the batch with the OpenAI client directly and attach the returned embeddings to the nodes, though the member is unsure how to do that.
Hey everyone. I'm looking to use the openai batch api (half price) to get embeddings in bulk. Is it feasible to pass embeddings directly to llamaindex, and would anyone be able to point me in a good direction for this?
The batch API can take up to 24 hours to run, so you'd need some way to schedule the embeddings and then use those embeddings for ingestion.
You can use the openai client itself to do this though, and just attach the embeddings to your nodes and insert them once the embeddings are available.
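For reference, a minimal sketch of the submission side with the openai Python client; the texts, the `embedding_requests.jsonl` file name, the model, and the `node-{i}` custom ids are placeholders, not anything llama-index prescribes:

```python
import json
import time

from openai import OpenAI

client = OpenAI()

texts = ["first document chunk", "second document chunk"]  # your node texts

# 1. Write one /v1/embeddings request per line to a JSONL file.
with open("embedding_requests.jsonl", "w") as f:
    for i, text in enumerate(texts):
        f.write(json.dumps({
            "custom_id": f"node-{i}",
            "method": "POST",
            "url": "/v1/embeddings",
            "body": {"model": "text-embedding-3-small", "input": text},
        }) + "\n")

# 2. Upload the file and create the batch (24h completion window).
batch_file = client.files.create(
    file=open("embedding_requests.jsonl", "rb"), purpose="batch"
)
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/embeddings",
    completion_window="24h",
)

# 3. Poll until the batch reaches a terminal status, then parse the output.
while (batch := client.batches.retrieve(batch.id)).status not in (
    "completed", "failed", "expired", "cancelled"
):
    time.sleep(60)

if batch.status != "completed":
    raise RuntimeError(f"batch ended with status {batch.status}")

# Map each custom_id back to its embedding vector.
embeddings = {}
for line in client.files.content(batch.output_file_id).text.splitlines():
    result = json.loads(line)
    embeddings[result["custom_id"]] = result["response"]["body"]["data"][0]["embedding"]
```

In practice you'd run the polling step as a scheduled job rather than a blocking loop, given the potential 24-hour turnaround.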
thanks. yeah, the second piece of what you mentioned is what i was interested in doing: creating the batch through the openai client. but i wasn't sure how to attach the embeddings to nodes and insert those (as opposed to inserting text).
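A sketch of that last step, assuming the `texts` list and `embeddings` dict from the batch sketch above: `TextNode` accepts an `embedding` argument, and nodes that already carry an embedding are not re-embedded on insert.

```python
from llama_index.core import VectorStoreIndex
from llama_index.core.schema import TextNode

# Build nodes with their embeddings already set, so llama-index
# skips calling the embedding model for them at insert time.
nodes = [
    TextNode(text=text, embedding=embeddings[f"node-{i}"])
    for i, text in enumerate(texts)
]

# Nodes with a non-None .embedding are inserted as-is.
index = VectorStoreIndex(nodes=nodes)
```

Note that queries against the index still embed the query string in real time, so an embed model is still needed at query time; the batch API only covers the bulk ingestion side.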