sweet thanks! also, am I correct in thinking that async LLM calls during query() won't speed things up, since generating and refining a response requires sequential calls to the LLM?
@yoelk hmmm, the only disadvantage I can think of is that async might ping the embedding model a lot in a short amount of time, which can sometimes cause rate-limit errors and other load-related issues.
Maybe I'm missing something else though - I was kind of surprised it wasn't turned on by default tbh
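fwiw, here's a toy sketch of why both points hold - this isn't any library's actual API, `mock_model_call` just fakes network latency with `asyncio.sleep`. Refine-style synthesis feeds each answer into the next call, so the calls can't overlap and async buys you nothing; embedding calls are independent, so `asyncio.gather` overlaps them, which is exactly why all the requests land on the provider at once:

```python
import asyncio
import time

LATENCY = 0.05  # pretend each model call takes 50 ms

async def mock_model_call(prompt: str) -> str:
    # Stand-in for a network round trip to an LLM or embedding model.
    await asyncio.sleep(LATENCY)
    return f"response to {prompt!r}"

async def sequential_refine(chunks: list[str]) -> float:
    # Refine-style synthesis: each call depends on the previous answer,
    # so total time is the *sum* of the per-call latencies.
    start = time.perf_counter()
    answer = ""
    for chunk in chunks:
        answer = await mock_model_call(answer + chunk)
    return time.perf_counter() - start

async def concurrent_embed(chunks: list[str]) -> float:
    # Embedding calls are independent, so gather overlaps them and total
    # time is roughly *one* latency -- but every request hits the
    # provider simultaneously, which is where rate limits bite.
    start = time.perf_counter()
    await asyncio.gather(*(mock_model_call(c) for c in chunks))
    return time.perf_counter() - start

chunks = ["a", "b", "c", "d"]
seq = asyncio.run(sequential_refine(chunks))
par = asyncio.run(concurrent_embed(chunks))
print(f"sequential: {seq:.2f}s, concurrent: {par:.2f}s")
```

with 4 chunks the sequential path takes ~4x the latency while the gathered path takes ~1x, so async helps retrieval/embedding but not the refine loop.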