----------------------
For summarization tasks, the best-suited index in LlamaIndex is the SummaryIndex. It stores all of your documents and, by default, passes all of them to the query engine instead of only a retrieved subset, which makes it a good fit for queries that ask for a summary of the text in your documents.
For instance, a summarization query could look like one of the following:
- “What is a summary of this collection of text?”
- “Give me a summary of person X’s experience with the company.”
A summary index is well suited to this use case because, by default, it goes through all of the data. Empirically, setting response_mode="tree_summarize" also leads to better summarization results.
Here is an example of how to use it:
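from llama_index.core import SummaryIndex  # older releases: from llama_index import SummaryIndex
# `documents` is assumed to be a list of Document objects loaded earlier
index = SummaryIndex.from_documents(documents)
query_engine = index.as_query_engine(response_mode="tree_summarize")
response = query_engine.query("<summarization_query>")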
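To give a rough idea of what a tree-style summarization pass does, here is a minimal conceptual sketch in plain Python. It is not LlamaIndex's actual implementation: `tree_summarize`, `toy_summarizer`, and the `fanout` parameter are hypothetical names made up for this illustration, and the real response synthesizer groups text by what fits in the prompt rather than by a fixed count. The idea shown is only the bottom-up reduction: summarize groups of chunks, then summarize the summaries, until a single answer is left.

```python
from typing import Callable, List


def tree_summarize(
    chunks: List[str],
    summarize_fn: Callable[[List[str]], str],
    fanout: int = 2,
) -> str:
    """Reduce chunks to one summary by summarizing groups level by level."""
    if not chunks:
        raise ValueError("need at least one chunk")
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    level = list(chunks)
    # Always run at least one pass so a single chunk is still summarized.
    while True:
        level = [
            summarize_fn(level[i:i + fanout])
            for i in range(0, len(level), fanout)
        ]
        if len(level) == 1:
            return level[0]


def toy_summarizer(texts: List[str]) -> str:
    # Hypothetical stand-in for an LLM call: it only brackets its inputs,
    # which makes the shape of the tree visible in the final string.
    return "(" + " + ".join(texts) + ")"


chunks = ["a", "b", "c", "d", "e"]
summary = tree_summarize(chunks, toy_summarizer, fanout=2)
print(summary)
```

In a real setup, `summarize_fn` would be a call to your LLM with a summarization prompt. The nesting in the printed string shows how every chunk contributes to the final answer, which matches the point above that a summary index goes through all the data.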
You can find more information about the SummaryIndex in the documentation and notebooks.
----------------------