
Updated 9 months ago

Is there a way to call AzureOpenAI in an async way? Currently, I am using AzureOpenAI from llama_index.llms.azure.openai

At a glance

The community members discuss how to use the AzureOpenAI class from the llama_index library asynchronously. The main points are:

1. For text completion, the acomplete and astream_complete methods support async calls with AzureOpenAI; in a chat context, the achat and astream_chat methods do the same.

2. Async methods can improve performance by letting tasks run concurrently without blocking the rest of the program, especially for I/O-bound work such as network requests.

3. In LlamaIndex, async can bring significant performance gains, such as a roughly 2x speedup in the query pipeline and faster ingestion for larger workloads.

There is no explicitly marked answer to the final question about how async helps in the specific case of await llm.achat(message).

@kapa.ai Is there a way to call AzureOpenAI in async way ? Currently, I am using AzureOpenAI from llama_index.llms.azure.openai
Please stand by @praveenks30#120294 as I search the knowledge base. I normally take about 20 seconds or so. :wave:
----------------------
Yes, you can call AzureOpenAI asynchronously. The acomplete and astream_complete methods support async completion calls. Here is an example of how you can use them:

Python
from llama_index.llms.azure_openai import AzureOpenAI

llm = AzureOpenAI(model="YOUR_AZURE_OPENAI_COMPLETION_MODEL_NAME",
                  deployment_name="YOUR_AZURE_OPENAI_COMPLETION_DEPLOYMENT_NAME",
                  api_key="YOUR_AZURE_OPENAI_API_KEY",
                  azure_endpoint="YOUR_AZURE_OPENAI_ENDPOINT",
                  api_version="2023-07-01-preview")

# Non-streaming: await the complete response in one call
resp = await llm.acomplete("Paul Graham is ")
print(resp)

# Streaming: iterate over the response deltas as they arrive
resp = await llm.astream_complete("Paul Graham is ")
async for delta in resp:
    print(delta.delta, end="")


Please replace the placeholders with your actual values.
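
Note that top-level await, as written above, only works in environments with a running event loop, such as a Jupyter notebook. In a plain Python script, you would wrap the calls in a coroutine and drive it with asyncio.run, roughly like this (a minimal sketch reusing the llm object configured above):

Python
import asyncio

async def main():
    # Run the async completion inside a coroutine; asyncio.run starts the event loop.
    resp = await llm.acomplete("Paul Graham is ")
    print(resp)

asyncio.run(main())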

You can find more details in the LlamaIndex documentation.
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
@kapa.ai will it work with chat as well ?
Let me check the knowledge base for you. One moment! :books:
----------------------
Yes, async works with chat as well. The achat and astream_chat methods support async calls in a chat context. Here is an example of how you can use them:

Python
from llama_index.llms.azure_openai import AzureOpenAI
from llama_index.core.llms import ChatMessage

llm = AzureOpenAI(model="YOUR_AZURE_OPENAI_COMPLETION_MODEL_NAME",
                  deployment_name="YOUR_AZURE_OPENAI_COMPLETION_DEPLOYMENT_NAME",
                  api_key="YOUR_AZURE_OPENAI_API_KEY",
                  azure_endpoint="YOUR_AZURE_OPENAI_ENDPOINT",
                  api_version="2023-07-01-preview")

messages = [
    ChatMessage(role="system", content="You are a helpful assistant."),
    ChatMessage(role="user", content="Who won the world series?"),
]

# Non-streaming: await the full chat response in one call
resp = await llm.achat(messages)
print(resp)

# Streaming: iterate over the response deltas as they arrive
resp = await llm.astream_chat(messages)
async for delta in resp:
    print(delta.delta, end="")


Please replace the placeholders with your actual values.

You can find more details in the LlamaIndex documentation.
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
@kapa.ai How does async help here ?
Let me check the knowledge base for you. One moment! :books:
----------------------
Async, short for asynchronous, allows tasks to be executed without blocking the execution of the rest of the program. This means that while one task is waiting for a response (for example, from an API call), other tasks can continue to execute. This can lead to significant performance improvements, especially in I/O bound tasks such as network requests.

In the context of LlamaIndex, async can be used to speed up the execution of tasks. For example, in the query pipeline, async is used to send queries to multiple RAG query engines at once and combine the results. This can lead to a 2x speedup compared to synchronous execution, as shown in the documentation example.

In the ingestion pipeline, async with parallel processing is also shown to be faster than synchronous execution. The speedup can be even greater for larger workloads.

So, in summary, async can help to improve the performance of LlamaIndex by allowing tasks to be executed in parallel and without blocking the execution of the rest of the program.
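
To make this concrete for a single call like await llm.achat(messages): one awaited call is not faster by itself, but async lets you issue several such calls concurrently instead of waiting for each one in turn. A minimal sketch, assuming an AzureOpenAI client configured as in the examples above (the question strings are placeholders):

Python
import asyncio

from llama_index.core.llms import ChatMessage
from llama_index.llms.azure_openai import AzureOpenAI

llm = AzureOpenAI(model="YOUR_AZURE_OPENAI_COMPLETION_MODEL_NAME",
                  deployment_name="YOUR_AZURE_OPENAI_COMPLETION_DEPLOYMENT_NAME",
                  api_key="YOUR_AZURE_OPENAI_API_KEY",
                  azure_endpoint="YOUR_AZURE_OPENAI_ENDPOINT",
                  api_version="2023-07-01-preview")

async def main():
    questions = ["Who won the world series?",
                 "What is LlamaIndex?",
                 "What does async IO mean?"]
    # Each achat call is a network-bound coroutine. Awaiting them one by one
    # would take roughly the sum of their latencies; asyncio.gather runs them
    # concurrently, so total wall time is close to the slowest single call.
    tasks = [llm.achat([ChatMessage(role="user", content=q)]) for q in questions]
    responses = await asyncio.gather(*tasks)
    for question, resp in zip(questions, responses):
        print(question, "->", resp.message.content)

asyncio.run(main())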
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
@kapa.ai how does async help here: await llm.achat(message)?