The community member constructed a SummaryIndex from a collection of nodes and then discarded the nodes. They want to know the best way to retrieve that collection from the SummaryIndex held in memory. They suggest that a get_nodes() function on the SummaryIndex instance would be useful, since they assume querying must iterate over the list of nodes.
In the comments, another community member suggests that summary_index.docstore.docs returns a dictionary mapping node IDs to nodes, which seems to answer the original question.
I have constructed a SummaryIndex from a collection of nodes and then thrown the nodes away. What is the best way to retrieve the collection of nodes from the SummaryIndex stored in memory? Should I go through the docstore? I would have thought that a get_nodes() function on the SummaryIndex instance would be useful (it should be defined on a parent class in the class hierarchy, maybe ComponentIndex). Why do I say this? Because I assume that querying must iterate over the list of nodes, so there should be an efficient way to access them. Thanks.
I answered my own question by searching the LlamaIndex source code: nodes = summary_index._index_struct.nodes, which involves accessing a private variable. This is useful for learning and exploring the code, but I obviously should not rely on this approach in any code I wish to deploy. Do the maintainers of LlamaIndex simply assume that a public accessor is not necessary or useful?
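For anyone landing here later, the two approaches mentioned in this thread can be sketched roughly as below. This is only a sketch under assumptions, not the library's official API: the helper names nodes_from_docstore and nodes_in_index_order are made up for illustration, and the second helper assumes that the index exposes a public index_struct attribute whose nodes field is a list of node IDs rather than node objects, which is worth verifying against your installed LlamaIndex version. To keep the example runnable without LlamaIndex, it uses a stand-in object that mimics only the attributes involved.

```python
from types import SimpleNamespace


def nodes_from_docstore(index):
    """Return every node in the index's docstore.

    Relies only on `index.docstore.docs`, the dict of node ID -> node
    suggested in this thread. Order is the dict's insertion order.
    """
    return list(index.docstore.docs.values())


def nodes_in_index_order(index):
    """Return nodes in the order recorded by the index structure.

    ASSUMPTION: `index.index_struct.nodes` is a list of node IDs
    (not node objects); check this against your LlamaIndex version.
    """
    docs = index.docstore.docs
    return [docs[node_id] for node_id in index.index_struct.nodes]


# Hypothetical stand-in for a SummaryIndex, mimicking only the
# attributes used above so the sketch runs without LlamaIndex.
fake_index = SimpleNamespace(
    docstore=SimpleNamespace(docs={"b": "node B", "a": "node A"}),
    index_struct=SimpleNamespace(nodes=["a", "b"]),
)

print(nodes_from_docstore(fake_index))
print(nodes_in_index_order(fake_index))
```

With a real index you would pass summary_index in place of fake_index. The second helper may be preferable if the docstore is shared with other indexes, since it only returns the nodes this index refers to.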