TreeSummarize is just combining all texts

I have a question about the TreeSummarize class.

summarizer.summarize(prompt, [texts])

If the combined length of all the texts is less than the context window, it seems it does not perform tree-based summarization, but rather just stacks all the texts together and makes a single call?

What if I want to force it to summarize each text separately and perform tree-based summarization?

For example, I have a document of size 20k tokens, and I declare a function to first split that text into chunks of ~4k tokens.

So I will have a list of size 5 (20k/4k).

And I see that my tree summarizer performs the summarization in one API call (basically sending everything at once). Why is it not performing tree summarization?
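
For context, a minimal sketch of the setup being described, assuming LlamaIndex's `SentenceSplitter` and the `TreeSummarize` response synthesizer from `llama_index.core` (the file path, query string, and chunk size are illustrative):

```python
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.response_synthesizers import TreeSummarize

document_text = open("document.txt").read()  # the ~20k-token document (illustrative path)

# Split into ~4k-token chunks -- roughly 5 chunks for a 20k-token document.
splitter = SentenceSplitter(chunk_size=4096, chunk_overlap=0)
chunks = splitter.split_text(document_text)

# verbose=True prints how many chunks remain after the synthesizer repacks
# them to fill the context window -- this is where 5 chunks can become 1.
summarizer = TreeSummarize(verbose=True)
response = summarizer.get_response("Summarize this document.", chunks)
```

TreeSummarize repacks the input chunks to fill the context window before summarizing, which is why several small chunks can collapse into a single LLM call.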
By tree-based summarization I mean:

chunk1+chunk2=summary1
chunk3+chunk4=summary2
summary1+summary2=summary3

etc.

So when the combined length of all the texts in the list is less than the model's context window, it seems it is not doing this, but just stacking all the texts together and making one API call. Or is it meant to work this way?
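
In code, the reduction described above looks roughly like this (a sketch; `llm_call` is a stand-in for a single completion request, not the actual TreeSummarize internals):

```python
def tree_summarize(llm_call, chunks: list[str], fan_in: int = 2) -> str:
    """Pairwise-reduce chunks until a single summary remains."""
    while len(chunks) > 1:
        # Merge each group of `fan_in` chunks into one summary.
        chunks = [
            llm_call("Summarize the following text:\n\n" + "\n\n".join(chunks[i : i + fan_in]))
            for i in range(0, len(chunks), fan_in)
        ]
    return chunks[0]
```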
Yeah, it compacts the chunks to save LLM calls -- I don't actually think there is an option to turn that off
(I would expect a single LLM call to perform better though? And much faster)
@Logan M It is faster, yes, but what about the “lost in the middle” problem?
ngl I think that's less of an issue with newer LLMs these days 👀 If you want, you could try using Refine to sequentially create a rolling summary and see how that looks
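
For reference, a rolling-summary attempt with LlamaIndex's `Refine` synthesizer might look like this (a sketch; the query string is illustrative, and `chunks` is the pre-split list from above):

```python
from llama_index.core.response_synthesizers import Refine

# Refine walks the chunks sequentially: the answer from chunk N is fed
# into the prompt for chunk N+1, so each chunk triggers its own LLM call.
refiner = Refine(verbose=True)
rolling_summary = refiner.get_response("Summarize this document.", chunks)
```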
One question: did you try to trace the OpenAI call usage with an open-source tool like Langfuse? I think the current TreeSummarize callback manager is not working, as I can't track down any OpenAI call the TreeSummarize object made. They also have 1-2 "# TODO" comments in the newest version of the code.
@vietphuon no, I did not trace it. But when I call summarizer.summarize I see some stdout output, something like "1 chunk found" or similar. So I assume there was only one API call, but I can't guarantee it.
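
For anyone who wants to verify the call count rather than rely on stdout, one way to attempt tracing is Langfuse's LlamaIndex callback integration (a sketch, assuming the `langfuse` package's v2-style handler; whether TreeSummarize actually emits the callback events is exactly what's in question above):

```python
from llama_index.core import Settings
from llama_index.core.callbacks import CallbackManager
from langfuse.llama_index import LlamaIndexCallbackHandler

# Register the Langfuse handler globally; LLM calls made through
# LlamaIndex components should then show up as traces in Langfuse.
Settings.callback_manager = CallbackManager([LlamaIndexCallbackHandler()])
```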