----------------------
The extracts provided do not contain specific information on how to make asynchronous calls using RAGStringQueryEngine. However, there is an example of making asynchronous calls with a list of queries using a generic query_engine in LlamaIndex. Here's the example:
import asyncio
import time

# `index` and `query_str` must already be defined (not shown in this excerpt)

# a list of different queries
query_list = [query_str] * 3

start_time = time.perf_counter()
query_engine = index.as_query_engine(
    response_mode="tree_summarize",
)

# run each query concurrently
async def async_query(query_engine, questions):
    tasks = [query_engine.aquery(q) for q in questions]
    r = await asyncio.gather(*tasks)
    return r

_ = asyncio.run(async_query(query_engine, query_list))
elapsed_time = time.perf_counter() - start_time
print(f"{elapsed_time:0.3f}s")
(Source: Async Query Demo)
In this example, asyncio.gather is used to run multiple queries concurrently. You can replace query_engine with an instance of RAGStringQueryEngine and use the aquery method for asynchronous calls. Please note that this is a general approach and might need adjustments based on the specific requirements of RAGStringQueryEngine.
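To make the pattern concrete without depending on a particular index, here is a minimal, self-contained sketch. StandInQueryEngine, run_queries, and timed_run are hypothetical names used only for illustration; they are not part of LlamaIndex, and the stand-in simply sleeps instead of calling a retriever or an LLM. The only assumption about the real engine is that it exposes an awaitable aquery method.

```python
import asyncio
import time


class StandInQueryEngine:
    """Hypothetical stand-in for a query engine; NOT a LlamaIndex class."""

    def __init__(self, delay: float = 0.2) -> None:
        self.delay = delay

    async def aquery(self, question: str) -> str:
        # Simulate a non-blocking retrieval + LLM call.
        await asyncio.sleep(self.delay)
        return f"answer to: {question}"


async def run_queries(engine, questions):
    # Create one aquery() coroutine per question, then await them together.
    tasks = [engine.aquery(q) for q in questions]
    return await asyncio.gather(*tasks)


def timed_run(questions, delay=0.2):
    # Run all questions against the stand-in engine and measure wall time.
    engine = StandInQueryEngine(delay=delay)
    start = time.perf_counter()
    answers = asyncio.run(run_queries(engine, questions))
    return answers, time.perf_counter() - start


if __name__ == "__main__":
    answers, elapsed = timed_run(["q1", "q2", "q3"])
    print(answers)
    print(f"{elapsed:0.3f}s")
```

asyncio.gather returns results in the same order as the inputs, so answers line up with the questions. The calls only overlap if aquery actually awaits non-blocking work; if an engine's async path just runs blocking synchronous code, the queries will effectively execute one after another. Whether that applies to RAGStringQueryEngine is not confirmed by the extracts, so it is worth checking how its async method is implemented.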
For more detailed information, you may want to refer to the LlamaIndex documentation or the LlamaIndex GitHub repository.
----------------------