`DocumentSummaryIndex` automates this process: it generates a summary for each document, uses the summary to select which document(s) should be used to answer a query, and sends the whole selected document(s) to the LLM.

```python
from llama_index.core import VectorStoreIndex
from llama_index.core.prompts import PromptTemplate
from llama_index.core.response_synthesizers import TreeSummarize
from llama_index.core.schema import TextNode

# Customize the summarization prompt as needed.
DEFAULT_TREE_SUMMARIZE_TMPL = (
    "Context information from multiple sources is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the information from multiple sources and not prior knowledge, "
    "answer the query.\n"
    "Query: {query_str}\n"
    "Answer: "
)
DEFAULT_TREE_SUMMARIZE_PROMPT = PromptTemplate(DEFAULT_TREE_SUMMARIZE_TMPL)

summarizer = TreeSummarize(
    llm=llm, use_async=True, summary_template=DEFAULT_TREE_SUMMARIZE_PROMPT
)

# You would do this for every document/text chunk.
response = summarizer.get_response("query_str", ["text1"])

# Here I only have one node.
node = TextNode(text=str(response))
index = VectorStoreIndex(nodes=[node])
```
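Conceptually, the `tree_summarize` mode batches text chunks, summarizes each batch, and recurses on the batch summaries until a single answer remains. Below is a minimal pure-Python sketch of that strategy; it is illustrative, not the LlamaIndex implementation, and `summarize()` is a stub standing in for an LLM call made with the prompt template above.

```python
def summarize(query: str, chunks: list[str]) -> str:
    # Stub "LLM": just join the chunks. A real call would fill the
    # tree-summarize prompt with the chunks as {context_str}.
    return " ".join(chunks)


def tree_summarize(query: str, chunks: list[str], batch_size: int = 2) -> str:
    # Repeatedly summarize batches of chunks until one summary remains.
    while len(chunks) > 1:
        chunks = [
            summarize(query, chunks[i : i + batch_size])
            for i in range(0, len(chunks), batch_size)
        ]
    return chunks[0]


result = tree_summarize("What is X?", ["a", "b", "c"])
```

With a real LLM, each `summarize` call would compress its batch, so the tree converges to one answer no matter how many chunks you start with.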
However, you can refer to the LlamaIndex documentation on `core.response.Response` for more detailed information.

`DocumentSummaryIndex` in LlamaIndex is a specialized index that extracts a summary from each document and stores that summary, as well as all nodes corresponding to the document. It is particularly useful for summarizing large amounts of text from various documents.

A `DocumentSummaryIndex` is built using the `from_documents` method. This method takes a list of documents, a language model instance (`llm`), a list of transformations (like a `SentenceSplitter`), a `response_synthesizer`, and a boolean flag to show progress. The `response_synthesizer` is configured with `response_mode` set to `"tree_summarize"`; this mode is used to generate a summary of the text in the documents. The `from_documents` method processes each document, applies the transformations, and uses the language model to generate a summary for each document. The summaries and corresponding nodes are then stored in the index.

```python
response_synthesizer = get_response_synthesizer(
    response_mode="tree_summarize", use_async=True
)
doc_summary_index = DocumentSummaryIndex.from_documents(
    city_docs,
    llm=chatgpt,
    transformations=[splitter],
    response_synthesizer=response_synthesizer,
    show_progress=True,
)
```
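To make the build step concrete, here is a toy sketch of what `from_documents` does conceptually: split each document into nodes, summarize them, and store the summary alongside the nodes, keyed by document id. All names are illustrative stand-ins (a fixed-width `split` instead of `SentenceSplitter`, a stub `summarize` instead of an LLM call), not LlamaIndex internals.

```python
def split(text: str, size: int = 20) -> list[str]:
    # Stand-in for SentenceSplitter: fixed-width chunks.
    return [text[i : i + size] for i in range(0, len(text), size)]


def summarize(chunks: list[str]) -> str:
    # Stand-in for the tree_summarize LLM call: first chunk as "summary".
    return chunks[0]


def build_summary_index(docs: dict[str, str]) -> dict[str, dict]:
    # For each document: split into nodes, summarize, store both.
    index = {}
    for doc_id, text in docs.items():
        nodes = split(text)
        index[doc_id] = {"summary": summarize(nodes), "nodes": nodes}
    return index


idx = build_summary_index({"toronto": "Toronto has several major sports teams."})
```

The key design point this mirrors is that the index keeps *both* the per-document summary (used for matching queries) and the full set of nodes (returned once a document is selected).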
To retrieve relevant documents from a `DocumentSummaryIndex`, you can use the `retrieve` method of a `DocumentSummaryIndexEmbeddingRetriever` instance. This method takes a query string and returns the nodes corresponding to the documents whose summaries are relevant to the query. The retrieved nodes can then be used to generate a response.

```python
retriever = DocumentSummaryIndexEmbeddingRetriever(
    doc_summary_index,
    # similarity_top_k=1,
)
retrieved_nodes = retriever.retrieve("What are the sports teams in Toronto?")
```
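The retrieval step can be sketched in plain Python: score each document's *summary* against the query, then return *all* nodes of the best-matching document(s). This is illustrative only; word overlap stands in for the embedding similarity a real `DocumentSummaryIndexEmbeddingRetriever` would compute.

```python
def score(query: str, summary: str) -> int:
    # Stand-in for embedding similarity: count shared words.
    return len(set(query.lower().split()) & set(summary.lower().split()))


def retrieve(index: dict[str, dict], query: str, top_k: int = 1) -> list[str]:
    # Rank documents by how well their summary matches the query.
    ranked = sorted(index, key=lambda d: score(query, index[d]["summary"]), reverse=True)
    nodes = []
    for doc_id in ranked[:top_k]:
        nodes.extend(index[doc_id]["nodes"])  # whole document's nodes
    return nodes


index = {
    "toronto": {"summary": "sports teams in Toronto", "nodes": ["Raptors", "Maple Leafs"]},
    "paris": {"summary": "museums in Paris", "nodes": ["Louvre"]},
}
nodes = retrieve(index, "What are the sports teams in Toronto?")
```

Note the granularity: matching happens at the summary level, but the retriever hands back every node of the selected document, which is what lets the LLM see the whole document when answering.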
To generate a response from the retrieved nodes, you can use a `RetrieverQueryEngine` instance. This query engine takes a `retriever` and a `response_synthesizer` as arguments. The `query` method of the query engine takes a query string and returns a response synthesized from the retrieved nodes.

```python
response_synthesizer = get_response_synthesizer(response_mode="tree_summarize")
query_engine = RetrieverQueryEngine(
    retriever=retriever,
    response_synthesizer=response_synthesizer,
)
response = query_engine.query("What are the sports teams in Toronto?")
print(response)
```
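Under the hood, this kind of query engine is just a composition of the two pieces: retrieve nodes, then synthesize an answer from them. A minimal sketch, with stub functions standing in for the real retriever and LLM-backed synthesizer:

```python
def make_query_engine(retrieve, synthesize):
    # Compose retrieval and synthesis into a single query() callable.
    def query(question: str) -> str:
        nodes = retrieve(question)
        return synthesize(question, nodes)
    return query


# Stub components; a real engine would call an LLM inside synthesize.
retrieve = lambda q: ["Raptors", "Maple Leafs"]
synthesize = lambda q, nodes: "Answer based on: " + ", ".join(nodes)

engine = make_query_engine(retrieve, synthesize)
answer = engine("What are the sports teams in Toronto?")
```

Because the two stages are decoupled, you can swap either one (a different retriever, a different `response_mode`) without touching the other.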
The `response_synthesizer` plays a crucial role in both building the `DocumentSummaryIndex` and generating responses from it. It is used to generate summaries of the documents during the index building process, and to synthesize responses from the retrieved nodes during the query process.