Creating Summaries with DocumentSummaryIndex

At a glance

A community member is using DocumentSummaryIndex to summarize documents and has a class extending BaseExtractor that creates extra metadata. When the extractor is passed to DocumentSummaryIndex.from_documents through an extractors argument, it is never called. They are looking for a way to hand the extractor to the builder, which they consider cleaner than their current approach of looping over the documents and appending the metadata.

In the comments, another community member suggests that the extractor is being passed in the wrong place and links to an example that demonstrates metadata extraction. No reply is explicitly marked as the answer.

ShubyZ

Hello, I'm using DocumentSummaryIndex to create summaries of all the documents that I have. I have code that uses a class extending BaseExtractor to create extra metadata, plus code that loops through the documents and appends that metadata.

However, I ran into some code suggesting that this can be done by passing an extractor to the DocumentSummaryIndex builder:

summary_index = DocumentSummaryIndex.from_documents(
    documents=documents,
    transformations=[splitter],
    response_synthesizer=response_synthesizer,
    extractors=[sentiment_extractor],
)

The above code doesn't work; the extractor is never called. Is there a way to make this work? It would be cleaner than the approach I currently have.

1 comment

WhiteFang_Jr

I think you are passing the extractor in the wrong place. Please have a look at this example: https://docs.llamaindex.ai/en/stable/examples/metadata_extraction/MetadataExtractionSEC/
