Question regarding node postprocessing and window:
1) Can the node parser "window" function be applied to nodes, or only to documents? I have run it on nodes, but the resulting "window" only includes the text from the current node. (Would this require a custom parser?)
2) When running consecutive postprocessors, can rerankers consider the "window" text rather than the original "text"?
That is, I would like to process as follows:
1) docsplitter = CustomJSONNodeParser (which results in one node for each segment with the text / start / end / speaker)
2) WindowNodeParser = Include x "text" from surrounding nodes
3) Retrieve the top-k (10) nodes
4) Consider the node text to be the "window" metadata
5) GPT rerank based on the "window" metadata
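To make the query-time part of this concrete, here is a rough, library-free sketch of "swap the text for the window, then rerank". Everything in it is an assumption for illustration: nodes are plain dicts, `rerank_on_window` is a hypothetical helper, and `overlap_score` is a toy stand-in for the GPT reranking call. LlamaIndex does ship a `MetadataReplacementPostProcessor` that looks like it covers the swap step, but whether it composes with a given reranker in this order is exactly what I am unsure about.

```python
from typing import Callable, Dict, List


def rerank_on_window(
    nodes: List[Dict],
    query: str,
    score: Callable[[str, str], float],
    top_n: int = 3,
    window_key: str = "window",
) -> List[Dict]:
    """Replace each node's text with its window metadata, then rerank on that text."""
    swapped = []
    for node in nodes:
        # Fall back to the original text when a node has no window metadata.
        window = node["metadata"].get(window_key, node["text"])
        # Build a new dict so the retrieved nodes are left unchanged.
        swapped.append({**node, "text": window})
    ranked = sorted(swapped, key=lambda n: score(query, n["text"]), reverse=True)
    return ranked[:top_n]


def overlap_score(query: str, text: str) -> float:
    """Toy scorer (placeholder for an LLM reranker): counts query words found in text."""
    words = set(text.lower().split())
    return float(sum(w in words for w in query.lower().split()))


# Hypothetical retrieved nodes; a real retriever would return the top 10.
retrieved = [
    {"text": "Yes.", "metadata": {"window": "Did the budget pass? Yes. It passed on Monday."}},
    {"text": "Budget talk.", "metadata": {"window": "Hello. Budget talk. Goodbye."}},
    {"text": "No window here.", "metadata": {}},
]

best = rerank_on_window(retrieved, "budget passed monday", overlap_score, top_n=2)
for node in best:
    print(node["text"])
```

The point of the sketch is the ordering: the reranker only ever sees the window text, so a short segment like "Yes." is judged by its surrounding context rather than on its own.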
Example of how the nodes are restructured now, using SentenceWindowNodeParser:
from llama_index.core.node_parser import SentenceWindowNodeParser
# create the sentence window node parser w/ default settings
node_parser = SentenceWindowNodeParser.from_defaults(
    window_size=3,
    window_metadata_key="window",
    original_text_metadata_key="text",
)
base_nodes = node_parser.get_nodes_from_documents(md_nodes)
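
Since the window I get back only contains the current node's text, I assume a custom step would have to build the window across neighbouring nodes instead. A minimal sketch of what I mean, in plain Python rather than the LlamaIndex API: `SegmentNode` and `add_windows` are hypothetical names, and joining neighbours with a space is my own choice.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SegmentNode:
    """Hypothetical stand-in for one parsed segment (not a LlamaIndex class)."""
    text: str
    metadata: dict = field(default_factory=dict)


def add_windows(
    nodes: List[SegmentNode],
    window_size: int = 3,
    window_key: str = "window",
    sep: str = " ",
) -> List[SegmentNode]:
    """Store the text of up to `window_size` neighbours on each side under `window_key`."""
    for i, node in enumerate(nodes):
        start = max(0, i - window_size)
        end = min(len(nodes), i + window_size + 1)
        # The window includes the current node's own text, in document order.
        node.metadata[window_key] = sep.join(n.text for n in nodes[start:end])
    return nodes


segments = [SegmentNode(t) for t in ["a", "b", "c", "d", "e"]]
add_windows(segments, window_size=1)
print(segments[2].metadata["window"])  # b c d
```

This assumes the nodes are already in their original order and all belong to the same source; segments from different transcripts would need to be windowed separately.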