Hi! I have a list of prompts + documents

Hi! I have a list of prompts + documents (they are not related to each other). Is there a way to request OpenAI in parallel to speed things up?
use async 🙏
@Logan M Sorry, could you send a link to the proper documentation? The docs I've seen all use vector stores, whereas I just need to make async LLM calls (I don't need retrieval).
so I need to use llm.acomplete instead of llm.complete?
yes

For example

Plain Text
import asyncio

# queue up the coroutines, then run them all concurrently
calls = []
for prompt in prompts:
  calls.append(llm.acomplete(prompt))

results = await asyncio.gather(*calls)
Wow! I need to learn more about asyncio. Thanks!

Btw, I saw the AsyncOpenAI class. Do I need to use it as the llm, or is the normal OpenAI fine? If OpenAI is fine, then why do we need AsyncOpenAI?
AsyncOpenAI is just the async client, llama-index uses it automatically under the hood for async calls
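For example, here's a minimal sketch (the model name is just an example) showing that the same llama-index OpenAI LLM object handles both sync and async calls, so you never create AsyncOpenAI yourself:

Plain Text
import asyncio
from llama_index.llms.openai import OpenAI

# one LLM object exposes both styles; the async methods use openai's
# AsyncOpenAI client internally
llm = OpenAI(model="gpt-4o-mini")  # model name is just an example

async def main():
  # llm.complete(...) uses the sync client;
  # llm.acomplete(...) uses the async client under the hood
  result = await llm.acomplete("Say hello")
  print(result.text)

asyncio.run(main())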
Hi @Logan M ! Is there a way to embed a list of texts the same way?

I've tried:

Plain Text
embeddings = []

for text in texts:
  embeddings.append(Settings.embed_model.aget_text_embedding(text))
results = await asyncio.gather(*embeddings)


and this does not seem to work
how does it not work? 👀
It works for me

Plain Text
import asyncio
from llama_index.embeddings.openai import OpenAIEmbedding

async def embed():
  texts = ['one', 'two', 'three']

  embed_model = OpenAIEmbedding()

  # option 1: gather individual async embedding calls
  jobs = []
  for text in texts:
    jobs.append(embed_model.aget_text_embedding(text))
  embeddings = await asyncio.gather(*jobs)

  # option 2: the async batch method (max batch size is 2048 with openai)
  embed_model = OpenAIEmbedding(embed_batch_size=2000)

  embeddings = await embed_model.aget_text_embedding_batch(texts)
  return embeddings

asyncio.run(embed())
There's also the batch method, which will be faster than embedding one text at a time
@Logan M Oh... Maybe I did something wrong then... Which method is the fastest?

Plain Text
embeddings = embed_model.get_text_embedding_batch(texts)

?

There is also aget_text_embedding_batch(texts) (with an a prefix). How do I use that one? And is it faster than a plain get_text_embedding_batch?
It's much faster to use the batch method in most embedding models (especially openai)
I used aget_text_embedding_batch above
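For reference, here's a rough sketch of both batch calls (the embed_batch_size value is just an illustration):

Plain Text
import asyncio
from llama_index.embeddings.openai import OpenAIEmbedding

texts = ['one', 'two', 'three']

# embed_batch_size controls how many texts go into each API request
embed_model = OpenAIEmbedding(embed_batch_size=100)

# sync batch call
embeddings = embed_model.get_text_embedding_batch(texts)

# async batch call; same result, but it doesn't block the event loop
embeddings = asyncio.run(embed_model.aget_text_embedding_batch(texts))

print(len(embeddings), len(embeddings[0]))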
@Logan M Oh I see thanks!