What is the best way to group a list of nodes by similarity, creating a new parent node that holds a summary of all matched nodes, with the original nodes as its children?
This would be a custom method that you'll have to write. You would have to compare each node against all the other nodes, identify which ones are most similar, put them together, and then create a parent summary for each group of similar nodes.
But take the case of 5 files on different topics: the most similar nodes would be those from a single file only.
The others would have different info, IMO. And this is already handled by DocumentSummaryIndex.
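
If you still want the custom grouping, here is a minimal sketch of the idea in plain Python. It is not LlamaIndex API: the `Node` class, `summarize`, `group_by_similarity`, and the 0.8 threshold are all hypothetical names and values chosen for illustration. It assumes every node already carries an embedding vector, and `summarize` is a placeholder where you would call an LLM.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical stand-in for a real node type, not a library class.
    text: str
    embedding: list[float]
    children: list["Node"] = field(default_factory=list)


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def summarize(texts: list[str]) -> str:
    # Placeholder: in practice you would ask an LLM to summarize these texts.
    return " | ".join(texts)


def group_by_similarity(nodes: list[Node], threshold: float = 0.8) -> list[Node]:
    """Greedy grouping: each unassigned node seeds a group and pulls in
    every later unassigned node whose similarity to the seed meets the threshold."""
    parents: list[Node] = []
    assigned: set[int] = set()
    for i, seed in enumerate(nodes):
        if i in assigned:
            continue
        assigned.add(i)
        group = [seed]
        for j in range(i + 1, len(nodes)):
            if j in assigned:
                continue
            if cosine(seed.embedding, nodes[j].embedding) >= threshold:
                group.append(nodes[j])
                assigned.add(j)
        # The parent gets the mean of its children's embeddings.
        dim = len(seed.embedding)
        centroid = [sum(n.embedding[k] for n in group) / len(group) for k in range(dim)]
        parents.append(
            Node(text=summarize([n.text for n in group]), embedding=centroid, children=group)
        )
    return parents
```

This compares each node against the others, so it is O(n²) in the number of nodes, and the result depends on node order because the first node in a group acts as its seed. A proper clustering algorithm would be a more robust choice for large node lists.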