If the combined length of all texts is less than the model's context window, TreeSummarize doesn't seem to perform tree-based summarization; it just stacks all the texts together and makes a single call.
What if I want to force it to summarize each text separately and perform actual tree-based summarization?
For example, I have a document of ~20k tokens, and I declare a function that first splits the text into chunks of ~4k tokens.
So I end up with a list of 5 chunks (20k / 4k).
Yet my tree summarizer performs the summarization in a single API call (basically sending everything at once). Why is it not performing tree summarization?
In other words, as soon as the combined length of all texts in the list fits in the model's context window, it seems to skip the tree step entirely and just make one API call.
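For reference, here is a minimal sketch of my setup (assuming a recent llama-index version; the file name, query string, and chunk sizes are illustrative):

```python
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.response_synthesizers import TreeSummarize

# Placeholder for the ~20k-token document.
document_text = open("report.txt").read()

# Split into ~4k-token chunks -> a list of roughly 5 strings.
splitter = SentenceSplitter(chunk_size=4096, chunk_overlap=0)
chunks = splitter.split_text(document_text)

# verbose=True prints how many chunks remain after repacking; in my case it
# reports a single chunk, i.e. everything was merged into one LLM call.
summarizer = TreeSummarize(verbose=True)
summary = summarizer.get_response("Summarize the document.", chunks)
```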
ngl I think that's less of an issue with newer LLMs these days 👀 If you want, you could try using Refine to sequentially build up a rolling summary, and see how that looks
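Roughly like this, reusing the chunks from your example (a sketch, not tested against your setup; the query string is just an example):

```python
from llama_index.core.response_synthesizers import Refine

# The ~4k-token chunks from the question above.
chunks = ["chunk 1 text ...", "chunk 2 text ...", "chunk 3 text ..."]

# Refine walks the chunks sequentially: the first chunk produces an initial
# summary, and each later chunk refines it -- one LLM call per chunk, so the
# five chunks above would give five calls.
refiner = Refine(verbose=True)
rolling_summary = refiner.get_response("Summarize the document.", chunks)
```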
One question: did you try to trace the OpenAI call usage with an open-source tool like Langfuse? I think the current TreeSummarize callback manager is not working, as I can't track down any OpenAI call the TreeSummarize object made; they also have one or two "# TODO"s in the newest version of the code.
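For reference, wiring it in would look roughly like this (a sketch based on Langfuse's documented LlamaIndex callback integration; the API keys are assumed to be set in the environment):

```python
from llama_index.core import Settings
from llama_index.core.callbacks import CallbackManager
from langfuse.llama_index import LlamaIndexCallbackHandler

# Picks up LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY / LANGFUSE_HOST from env.
langfuse_handler = LlamaIndexCallbackHandler()
Settings.callback_manager = CallbackManager([langfuse_handler])

# Synthesizers created after this should report their LLM calls as traces.
# If TreeSummarize ignores the global manager (the suspicion above), it may
# be worth passing it explicitly: TreeSummarize(callback_manager=...).
```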
@vietphuon no, I did not trace it. But when I call summarizer.summarize I see some stdout output, something like "1 chunk found". So I assume there was only one API call, but I can't guarantee it.
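A lighter-weight way to confirm the call count, without an external tracer, might be llama-index's built-in TokenCountingHandler (a sketch; the tokenizer choice is an assumption):

```python
import tiktoken
from llama_index.core.callbacks import CallbackManager, TokenCountingHandler
from llama_index.core.response_synthesizers import TreeSummarize

# Count tokens per LLM event; the model name here is just an example.
token_counter = TokenCountingHandler(
    tokenizer=tiktoken.encoding_for_model("gpt-3.5-turbo").encode
)
summarizer = TreeSummarize(callback_manager=CallbackManager([token_counter]))
summarizer.get_response("Summarize the document.", chunks)  # chunks as above

# One entry per LLM call, so this should print 1 if everything really was
# packed into a single request.
print(len(token_counter.llm_token_counts))
```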