
Updated 11 months ago


Hi, after building a `VectorStoreIndex`, is there a way to get the nodes back? I'm parsing a document into hierarchical nodes, where the sub nodes are `IndexNode`s referencing the root node. To use `RecursiveRetriever`, I have to provide a `node_dict`, a `retriever_dict`, or a `query_engine_dict`. I assume that providing `retriever_dict` or `query_engine_dict` requires building an index for each root node, which could be wasteful, so I prefer `node_dict`. But if the `VectorStoreIndex` is loaded from some external storage, how can I get those nodes back, if that is possible at all?
3 comments
(`retriever_dict` and `query_engine_dict` are for looking up and running retrievers/query engines.)

The `node_dict` is a mapping from the `index_id` of each index node to the corresponding node.

It's not clear whether you persisted those corresponding nodes somewhere (either by pickling them or by putting them in a docstore).
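To make the `node_dict` mapping concrete, here is a minimal sketch. `FakeTextNode` and `FakeIndexNode` are hypothetical stand-ins for the library's node classes (so the snippet runs without llama_index installed), and the `resolved` line is only a rough picture of the lookup, not the library's actual code.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for llama_index node classes; only the fields
# needed for the illustration are modelled.
@dataclass
class FakeTextNode:
    node_id: str
    text: str


@dataclass
class FakeIndexNode:
    node_id: str
    text: str
    index_id: str  # id of the node this sub node points to


root = FakeTextNode(node_id="root-1", text="Full section text")
sub_nodes = [
    FakeIndexNode(node_id="sub-1", text="First chunk", index_id=root.node_id),
    FakeIndexNode(node_id="sub-2", text="Second chunk", index_id=root.node_id),
]

# node_dict maps each referenced index_id to the node it stands for.
node_dict = {root.node_id: root}

# Conceptually, a retrieved sub node is swapped for the node its index_id
# points to; unknown ids fall back to the sub node itself in this sketch.
resolved = [node_dict.get(n.index_id, n) for n in sub_nodes]
```

The important detail is that only `.get(key, default)` is called on the mapping, so anything that offers that method could stand in for a plain dict.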
@lgfusb Use a docstore: wrap it in a small class that implements a `.get` method which calls `get_document` and handles the default value, then pass that in as the `node_dict`. But be aware of https://github.com/run-llama/llama_index/issues/10454
Example subclass:
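# Import path assumes the llama-index >= 0.10 package layout
# (llama-index-storage-docstore-postgres).
from llama_index.storage.docstore.postgres import PostgresDocumentStore


class GetDocStore(PostgresDocumentStore):
    def get(self, docid, default=None):
        # Dict-style get() so the docstore can be used to retrieve the node
        # in a recursive retriever.
        # get_document(..., raise_error=False) is not general enough: it cannot
        # return a caller-supplied default.
        # Do not use all_nodes_dict = docstore.docs (in place of
        # {n.node_id: n for n in all_nodes}): it generates a full dict with all
        # docs and is static, though possibly faster.
        try:
            return self.get_document(docid)
        except ValueError:
            return default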
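If you'd rather not subclass one specific docstore, the same idea can be sketched as a small wrapper around any object that has `get_document`. This is only a sketch: `DocstoreNodeLookup` and `InMemoryDocstore` are hypothetical names, and the stand-in docstore is assumed to raise `ValueError` for unknown ids, as the subclass above expects.

```python
from typing import Any, Dict, Optional


class DocstoreNodeLookup:
    """Expose a docstore through a dict-like get(key, default)."""

    def __init__(self, docstore: Any) -> None:
        self._docstore = docstore

    def get(self, node_id: str, default: Optional[Any] = None) -> Any:
        # Look the node up lazily on every call instead of building a dict
        # of all documents up front.
        try:
            return self._docstore.get_document(node_id)
        except ValueError:
            return default


class InMemoryDocstore:
    """Hypothetical stand-in for a real docstore, used only for this demo."""

    def __init__(self, docs: Dict[str, Any]) -> None:
        self._docs = dict(docs)

    def get_document(self, doc_id: str) -> Any:
        # Assumption: a missing id raises ValueError.
        if doc_id not in self._docs:
            raise ValueError(f"doc_id {doc_id} not found.")
        return self._docs[doc_id]


lookup = DocstoreNodeLookup(InMemoryDocstore({"root-1": "root node"}))
found = lookup.get("root-1")
missing = lookup.get("unknown-id")
fallback = lookup.get("unknown-id", "fallback")
```

The wrapper would then be passed wherever the dict of nodes is expected. Whether that works end to end depends on the caller only ever using `.get`, which is worth checking against the linked issue.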