Mmm it depends. Do you have the top_k set very high? Could you maybe decrease the chunk size?
A tree or list index will probably be even slower
(especially a list, since it sends every node in your index to the LLM)
Maaaaybe a tree index will be faster (since the tree is built ahead of time), but hard to say
Ah. Can you explain chunk size? How does that relate to the top_k nodes? I want to summarize over all the data so I am doing top_k=50 and the default chunk size
Ah I see. So it's retrieving the closest 50 chunks (that are all 3900 tokens or less, which is the default) and then constructing the tree summary on the fly. If the chunk size or top k was smaller, this miiight run a little faster.
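Something like this is what I mean (just a sketch with recent llama_index imports, so exact names may differ in your version, and ./data is a placeholder path):

```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings

# smaller chunks = shorter nodes, so each summarization call is cheaper
Settings.chunk_size = 512
Settings.chunk_overlap = 50

documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)

# lower top_k = fewer nodes retrieved and summarized at query time
query_engine = index.as_query_engine(similarity_top_k=10)
response = query_engine.query("Give me a high-level summary.")
```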
If you want to summarize across all the data (and quickly), your best bet is a tree index it seems
Ok. And does a tree index route the query down to a specific node, or does it just create a hierarchy of summaries?
It creates a hierarchy of summaries, but then at query time does some stuff to iterate over the tree to generate a summary quickly
I say "does some stuff" because I haven't looked too closely at the source code yet for that one lol
Is a chunk just the entire text from a node?
And how are nodes parsed out from large documents?
By default, it just splits on tokens, with some overlap
There is also a sentence splitter, and you can use any splitter from langchain as well
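e.g. something like this (sketch, import paths from recent versions):

```python
from llama_index.core import SimpleDirectoryReader
from llama_index.core.node_parser import TokenTextSplitter, SentenceSplitter

documents = SimpleDirectoryReader("./data").load_data()

# default-style splitting: purely on tokens, with overlap between chunks
token_splitter = TokenTextSplitter(chunk_size=1024, chunk_overlap=200)

# sentence-aware splitting: tries not to cut mid-sentence
sentence_splitter = SentenceSplitter(chunk_size=1024, chunk_overlap=200)

nodes = sentence_splitter.get_nodes_from_documents(documents)
```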
Ahh I see. That's what the overlap param is too
Exactly! And there's two places that chunking/overlap happens
1. Initial parsing from documents into nodes (default chunk size 3900, default overlap 200)
2. During query time, if the retrieved text does not fit into a single LLM call (default overlap 20)
the node_parser controls (1), while the prompt_helper controls (2)
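In code, the two knobs look roughly like this (sketch using the newer Settings object, which I think exposes the prompt helper too, so double check the names against your version):

```python
from llama_index.core import Settings, PromptHelper

# (1) build-time chunking: how documents are parsed into nodes
Settings.chunk_size = 1024
Settings.chunk_overlap = 200

# (2) query-time chunking: how retrieved text is re-packed
# if it doesn't fit into a single LLM call
Settings.prompt_helper = PromptHelper(
    context_window=4096,
    num_output=256,
    chunk_overlap_ratio=0.1,  # newer versions use a ratio instead of a fixed token overlap
)
```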
Got it! Thank you for your help! I'm going to try a tree index
Actually! I'm reading the docs and I think getting the summary from a tree index rebuilds the mini tree as well
Could this be modified? I think it would be useful if the summaries were built at insertion time
Maybe? I think part of it too is the summary is guided by your query.
@jerryjliu0 thoughts on tree index summaries here? Just thinking about the fastest/easiest ways to compute summaries
@Clay sorry just catching up on this thread. out of curiosity, why are you using the tree index and not the vector index? we generally recommend the vector index (with smaller chunk sizes, as logan mentioned) for larger document sizes. you can set response_mode="tree_summarize" when answering a question
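something like this (sketch; exact API depends on your version):

```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)

# vector retrieval picks the top_k nodes, then tree_summarize builds
# a small summary tree over just those nodes at query time
query_engine = index.as_query_engine(
    similarity_top_k=50,
    response_mode="tree_summarize",
)
response = query_engine.query("Summarize the dataset.")
```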
Sorry, this is what I'm doing. I was asking if a tree index does the same thing as a vector index with tree summarize, but we determined that it does not
ahh gotcha - ok! sounds good