Hello. I don't manage to obtain parallel

Hello. I can't get parallel calls when summarizing documents:

import nest_asyncio
nest_asyncio.apply()

...

service_context = ServiceContext.from_defaults(llm=llm, chunk_size=1024)
response_synthesizer = get_response_synthesizer(
    use_async=True,
    response_mode="tree_summarize",
    summary_template=PromptTemplate(custom_tmpl)
)
doc_summary_index = DocumentSummaryIndex.from_documents(
    tables,
    service_context=service_context,
    response_synthesizer=response_synthesizer,
    show_progress=True,
    use_async=True,
)

Logs show that each element in tables is summarized one by one, each waiting for the previous one to complete, with no parallelism at all. Am I missing something? I didn't find any docs specifying how to set the level of parallelism (for example, how many calls to an API can run in parallel).

7 comments

Logan M

what LLM are you using?

Logan M

Not every LLM supports real async 😅

Logan M

oh wait -- I think that's actually correct behaviour

Logan M

The actual summarization of each document is happening concurrently (i.e. within a document, each pair of chunks is summarized concurrently during tree summarize)

But per-document level is sequential right now

Logan M

Very open to a PR to improve that

Logan M

I think it's as simple as spinning up async methods/jobs per document, and then using async_utils.run_jobs to limit the number of concurrent jobs and avoid rate limits
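
For anyone picking this up, here is a minimal sketch of that idea: one async job per document, with a cap on how many run at once. It uses only the standard library and is not LlamaIndex's implementation. The names summarize_document and summarize_all are hypothetical, and the sleep stands in for a real LLM call.

```python
import asyncio


async def summarize_document(doc: str) -> str:
    """Hypothetical stand-in for one document's async summarization."""
    await asyncio.sleep(0.1)  # simulates waiting on an LLM API call
    return f"summary of {doc}"


async def summarize_all(docs: list[str], max_concurrent: int = 4) -> list[str]:
    """Summarize every document with at most `max_concurrent` jobs in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(doc: str) -> str:
        # Each job waits here until one of the `max_concurrent` slots is free.
        async with semaphore:
            return await summarize_document(doc)

    # gather returns results in the same order as the input documents.
    return await asyncio.gather(*(bounded(doc) for doc in docs))


docs = [f"table_{i}" for i in range(8)]
summaries = asyncio.run(summarize_all(docs, max_concurrent=4))
print(summaries)
```

Lowering max_concurrent is the knob for staying under an API rate limit. In a notebook where an event loop is already running, the asyncio.run call would need nest_asyncio applied first, as in the original snippet.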

paull

OK, I see, that makes sense then. Thanks for the quick answer 👍
