
Updated 3 months ago

Hi all I have a list of Documents that I

Hi all, I have a list of Documents that I want to parse into nodes, and generate metadata about each node. Right now I am using the SimpleNodeParser, paired with some pre-built metadata extractors.

The question I have is regarding the SummaryExtractor. I want to create "prev" and "self" summaries for each node, to make sure that the local context of the Document is provided to the Node. However, I do not want the "prev" summary to be generated at the beginning of a new Document (referring to the first Node generated from a new Document), as this summary would refer to the last node from a previous Document (if I understand the functionality correctly), providing irrelevant context. I tried using the include_prev_next_rel, but that does not seem to resolve my issue. Should I write a custom metadata extractor for this functionality?
3 comments
hmm you could just remove that summary from the resulting nodes?

Otherwise yea, creating your own metadata extractor is an option. Tbh though, the current extractor probably shouldn't be doing that in the first place
I was thinking about just writing a simple check in the metadata_extractor.process_nodes() call to see whether the ref id matches the previous node's, and skip if it doesn't. It's only a temporary solution, but it would probably resolve it too ig
I'll think of something, thanks for the response regardless!
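For anyone landing here later, the two ideas above (remove the summary afterwards, and compare ref ids between neighbouring nodes) can be combined into a small post-processing pass. This is only a rough sketch, not the library's own behaviour: `StubNode` is a hypothetical stand-in so the snippet runs on its own, and it assumes the real nodes expose `ref_doc_id` and a `metadata` dict and that the "prev" summary is stored under the key `prev_section_summary`. Check the key name against your installed version before relying on it.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

# Assumption: the metadata key the "prev" summary is written to.
PREV_SUMMARY_KEY = "prev_section_summary"


@dataclass
class StubNode:
    """Hypothetical stand-in for a parsed node (only the fields used here)."""
    ref_doc_id: Optional[str]
    metadata: Dict[str, Any] = field(default_factory=dict)


def drop_cross_document_prev_summaries(
    nodes: List[StubNode], key: str = PREV_SUMMARY_KEY
) -> List[StubNode]:
    """Remove the "prev" summary from the first node of each Document.

    Nodes are assumed to be in parse order, so a change in ref_doc_id
    marks the start of a new Document.
    """
    previous_doc_id: Any = object()  # sentinel that never equals a real id
    for node in nodes:
        if node.ref_doc_id != previous_doc_id:
            # First node of a Document: its "prev" summary would describe
            # the tail of a different Document, so discard it.
            node.metadata.pop(key, None)
        previous_doc_id = node.ref_doc_id
    return nodes


# Small usage example with made-up data.
nodes = [
    StubNode("doc-a", {"section_summary": "a1", PREV_SUMMARY_KEY: "stale"}),
    StubNode("doc-a", {"section_summary": "a2", PREV_SUMMARY_KEY: "a1"}),
    StubNode("doc-b", {"section_summary": "b1", PREV_SUMMARY_KEY: "a2"}),
    StubNode("doc-b", {"section_summary": "b2", PREV_SUMMARY_KEY: "b1"}),
]
cleaned = drop_cross_document_prev_summaries(nodes)
```

Running this after extraction keeps the "self" summaries untouched and only drops the "prev" entry where the Document changes, so it doesn't need a custom extractor.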