Hello I have a quick question on llama

Hello, I have a quick question: on llama-index 0.4.14, does GPTSimpleVectorIndex call the embedding asynchronously by default?
Not by default, no. Use the option use_async=True when creating the index.
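A minimal sketch of what that looks like, assuming the 0.4.x-era llama-index API and a local `data/` directory of documents (the directory and SimpleDirectoryReader usage are illustrative, not from the thread):

```python
from llama_index import GPTSimpleVectorIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("data").load_data()

# use_async=True lets the index fire off the embedding calls concurrently
# while the index is built, instead of embedding one chunk at a time.
index = GPTSimpleVectorIndex(documents, use_async=True)
```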
Also, for query time we only support it if you set response_mode="tree_summarize", use_async=True.
sweet, thanks! Also, am I correct in thinking that async LLM calls during query() won't speed things up, since generating and refining a response requires sequential calls to the LLM?
It will, if you set response_mode="tree_summarize"! You're right that the default create_and_refine won't, since each refine call depends on the previous answer.
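A sketch of the query-time options named above, again assuming the 0.4.x-era API (the question string is a placeholder):

```python
# tree_summarize builds a summary tree over the retrieved chunks; the
# per-chunk LLM calls at each level are independent of each other, so
# use_async=True can run them concurrently. The default create_and_refine
# mode is inherently sequential, so async gives it no speedup.
response = index.query(
    "What does the document say about X?",
    response_mode="tree_summarize",
    use_async=True,
)
print(response)
```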
Hey Logan. Question on that - is there a reason not to use the async functionality?
@yoelk hmmm, the only disadvantage I can think of is that async might ping the embedding model many times in a short window, which can sometimes cause rate-limit errors and other load-related issues.

Maybe I'm missing something else, though - I was kind of surprised it wasn't turned on by default tbh
It probably should be turned on by default! The main thing is that all the Jupyter notebook examples would break, since you need nest_asyncio for async to work there.
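For context, the nest_asyncio fix mentioned here is a two-liner run at the top of the notebook:

```python
# Jupyter already runs its own asyncio event loop, so nested async calls
# fail unless the loop is patched to allow re-entry. nest_asyncio does that.
import nest_asyncio

nest_asyncio.apply()  # run this before any use_async=True call in the notebook
```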