
Is it possible to make, say, 10 queries

Is it possible to make, say, 10 queries to an LLM and put them in a single batch to get faster results? I know I can do asynchronous querying, but this is different. I would like to do both. Thanks.
nope, only async right now
Thank you, Logan. I must say, given the generality of LlamaIndex and the number of features it offers, the lack of batching is rather surprising to me. Batching seems like a relatively simple performance win. In fact, batching and async could be combined. Thanks again.
Ok, I understand the challenges involved with batching. Are you guys working on it? I realize it could take some time.
Not currently working on it. Most APIs either don't support batching, or batch internally based on incoming requests. None of our LLM interfaces, or the components that work with LLMs, expect batching either, which means it would take a lot of work to even make it useful.

Since async works just fine, it's lower priority at the moment.
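
For reference, here's a minimal sketch of the async approach described above. It assumes a recent LlamaIndex with the OpenAI integration installed (`pip install llama-index-llms-openai`); the model name and prompts are illustrative. Rather than batching 10 prompts into one request, it issues 10 `acomplete()` calls concurrently with `asyncio.gather`:

```python
import asyncio

# Import path assumes LlamaIndex >= 0.10 with the OpenAI integration;
# older versions used `from llama_index.llms import OpenAI`.
from llama_index.llms.openai import OpenAI

async def run_queries(prompts: list[str]):
    llm = OpenAI(model="gpt-3.5-turbo")  # model name is illustrative
    # acomplete() is the async counterpart of complete() on LlamaIndex LLMs;
    # asyncio.gather fires all requests concurrently and awaits them together.
    return await asyncio.gather(*(llm.acomplete(p) for p in prompts))

prompts = [f"Summarize topic {i} in one sentence." for i in range(10)]
responses = asyncio.run(run_queries(prompts))
for r in responses:
    print(r.text)
```

Each request is still a separate API call under the hood, but since they run concurrently, the wall-clock time is close to that of the slowest single request rather than the sum of all 10.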