Updated 2 months ago

Hello. I can't get parallel calls when summarizing documents:
import nest_asyncio
nest_asyncio.apply()

...

service_context = ServiceContext.from_defaults(llm=llm, chunk_size=1024)
response_synthesizer = get_response_synthesizer(
    use_async=True,
    response_mode="tree_summarize",
    summary_template=PromptTemplate(custom_tmpl)
)
doc_summary_index = DocumentSummaryIndex.from_documents(
    tables,
    service_context=service_context,
    response_synthesizer=response_synthesizer,
    show_progress=True,
    use_async=True,
)

Logs show that each element in tables is summarized one at a time, each waiting for the previous one to complete, with no parallelism at all. Am I missing something? I couldn't find any documentation on how to configure the level of parallelism (for example, how many API calls to run in parallel).
7 comments
what LLM are you using?
Not every LLM supports real async 😅
oh wait -- I think that's actually correct behaviour
The actual summarization of each document is happening concurrently (i.e. each pair of chunks per document is summarized concurrently during tree summarize)

But per-document level is sequential right now
Very open to a PR to improve that
I think it's as simple as spinning up async methods/jobs per document, and then using async_utils.run_jobs to limit the number of concurrent jobs and avoid rate limits
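For anyone picking this up, here is a rough sketch of that idea in plain asyncio, not the actual LlamaIndex implementation. `summarize_document`, `run_limited`, and `summarize_all` are hypothetical names made up for illustration; the real change would call the index's own async summarization and could use async_utils.run_jobs in place of the hand-rolled semaphore.

```python
import asyncio


async def summarize_document(doc: str) -> str:
    # Hypothetical stand-in for an async LLM summarization call.
    await asyncio.sleep(0.01)
    return f"summary of {doc}"


async def run_limited(coros, workers: int = 4):
    # Allow at most `workers` coroutines to run at the same time.
    semaphore = asyncio.Semaphore(workers)

    async def worker(coro):
        async with semaphore:
            return await coro

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(worker(c) for c in coros))


async def summarize_all(docs, workers: int = 4):
    # One job per document, capped at `workers` concurrent calls.
    jobs = [summarize_document(d) for d in docs]
    return await run_limited(jobs, workers=workers)


if __name__ == "__main__":
    print(asyncio.run(summarize_all(["a", "b", "c"], workers=2)))
```

The `workers` cap is what keeps you under API rate limits: raise it for more throughput, lower it if you start seeing rate-limit errors.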
Ok, I see, makes sense then. Thanks for the quick answer 👍