Hello, I'm trying to use a RecursiveRetriever following...

@kapa.ai Hello, I'm trying to use a RecursiveRetriever following this doc: https://docs.llamaindex.ai/en/stable/examples/retrievers/recursive_retriever_nodes/?h=indexnode. I'm storing my vectors in Qdrant. When I save all IndexNodes to Qdrant using

vector_index_chunk = VectorStoreIndex(all_nodes, embed_model=embed_model, storage_context=storage_context)

and then try to retrieve them later, I get an error like

ValueError: Query id e7cec67d-e470-4ee7-a737-5f765c4fea27 not found in either `retriever_dict` or `query_engine_dict`.

This is how I instantiate my retriever:

vector_retriever_chunk = VectorStoreIndex(
    nodes=[],
    embed_model=embedding_model,
    vector_store=vector_store,
    storage_context=storage_context,
).as_retriever(similarity_top_k=config.chunk_retrieved_max)
recursive_retriever = RecursiveRetriever("vector", retriever_dict={"vector": vector_retriever_chunk})

Do I have to store IndexNodes in Qdrant?

8 comments

kapa.ai

Please stand by @Bob as I search the knowledge base. I normally take about 20 seconds or so. :wave:

kapa.ai

----------------------
Yes, you need to store your IndexNodes into Qdrant. The VectorStoreIndex class is used to create an index of your nodes, which are then stored in Qdrant. This is done by passing your nodes and the StorageContext (which includes your Qdrant client) to the VectorStoreIndex constructor, as you've done in your code.

However, the error message you're seeing suggests that the query id you're trying to retrieve is not found in either the retriever_dict or query_engine_dict. This could be because the IndexNode with that id was not included when you created your VectorStoreIndex, or it could be a problem with how you've set up your RecursiveRetriever.

In your RecursiveRetriever instantiation, you're passing a dictionary with the key "vector" and the value vector_retriever_chunk to the retriever_dict parameter. This means that when you try to retrieve a node, the RecursiveRetriever will look for a retriever with the id "vector" in the retriever_dict. If the id of the node you're trying to retrieve is not "vector", you'll get a ValueError.

To fix this, you need to make sure that the ids of the nodes you're trying to retrieve match the keys in your retriever_dict. For example, if you're trying to retrieve a node with the id "e7cec67d-e470-4ee7-a737-5f765c4fea27", you should have a retriever with that id in your retriever_dict.

Here's an example of how you might do this:

# Create a dictionary mapping node ids to retrievers
retriever_dict = {node.node_id: vector_retriever_chunk for node in all_nodes}

# Instantiate the RecursiveRetriever
recursive_retriever = RecursiveRetriever("vector", retriever_dict=retriever_dict)

kapa.ai

In this code, all_nodes is a list of all your IndexNodes. This will create a retriever_dict where each node id is mapped to a retriever, which should prevent the ValueError you're seeing.

For more information, you can refer to the LlamaIndex documentation.
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------

Bob

@kapa.ai I don't have these node ids when I retrieve from a query. All these ids have been generated during the indexation process. I need to retrieve the document later with a simple text query.

Bob

Hello @Logan M, do you have any suggestions about that?

Logan M

This is a tad outdated. You can just give the index all your nodes and index nodes

# Import paths for llama-index >= 0.10; base_nodes and embed_model are defined earlier.
from llama_index.core import VectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.schema import IndexNode

sub_chunk_sizes = [128, 256, 512]
sub_node_parsers = [
    SentenceSplitter(chunk_size=c, chunk_overlap=20) for c in sub_chunk_sizes
]

all_nodes = []
for base_node in base_nodes:
    for n in sub_node_parsers:
        sub_nodes = n.get_nodes_from_documents([base_node])
        sub_inodes = [
            IndexNode.from_text_node(sn, base_node.node_id) for sn in sub_nodes
        ]
        all_nodes.extend(sub_inodes)

    # also add the original node, with the base node attached as its object
    original_node = IndexNode.from_text_node(base_node, base_node.node_id)
    original_node.obj = base_node
    all_nodes.append(original_node)

index = VectorStoreIndex(all_nodes, embed_model=embed_model)
retriever = index.as_retriever(similarity_top_k=4, verbose=True)
retriever.retrieve("test")

If it retrieves an index node, it will be replaced with the node/obj that is attached to it.
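
To make the replacement step concrete, here is a toy model in plain Python. It is not LlamaIndex's implementation: `ToyNode`, `ToyIndexNode`, `resolve` and the `node_lookup` mapping are hypothetical names invented for illustration, and the de-duplication of parents is an assumption of this sketch. It only shows the idea that a retrieved index node is swapped for the object it points at, and that an id that cannot be resolved ends in an error similar to the one quoted at the top of the thread.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union


@dataclass
class ToyNode:
    """Stand-in for an ordinary text node (hypothetical, not a LlamaIndex class)."""
    node_id: str
    text: str


@dataclass
class ToyIndexNode:
    """Stand-in for an index node: a small chunk that points at a parent by id."""
    node_id: str
    text: str
    index_id: str                    # id of the node this chunk refers to
    obj: Optional[ToyNode] = None    # parent attached at indexing time, if any


Hit = Union[ToyNode, ToyIndexNode]


def resolve(hits: List[Hit], node_lookup: Dict[str, ToyNode]) -> List[ToyNode]:
    """Replace each index node with its target and drop duplicate targets."""
    resolved: List[ToyNode] = []
    seen = set()
    for hit in hits:
        if isinstance(hit, ToyIndexNode):
            # Prefer the attached object, then fall back to a lookup by id.
            target = hit.obj if hit.obj is not None else node_lookup.get(hit.index_id)
            if target is None:
                raise ValueError(f"Index id {hit.index_id} could not be resolved")
            hit = target
        if hit.node_id not in seen:  # assumption: return each parent only once
            seen.add(hit.node_id)
            resolved.append(hit)
    return resolved


base = ToyNode("base-1", "full parent chunk")
hits = [
    ToyIndexNode("sub-1", "small chunk a", index_id="base-1"),
    ToyIndexNode("sub-2", "small chunk b", index_id="base-1", obj=base),
    ToyNode("plain-1", "an ordinary node"),
]
resolved = resolve(hits, {"base-1": base})
print([n.node_id for n in resolved])  # ['base-1', 'plain-1']
```

Both sub-chunks collapse to the same parent, and the plain node passes through untouched. With an empty `node_lookup` and no attached `obj`, the first hit would raise the `ValueError` instead.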

Bob

I don't have all_nodes when I instantiate my retriever because indexing is done beforehand on another server.

Logan M

Well, they need to end up in the vector db at some point. index.insert_nodes(nodes) works too
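
For the case where indexing already happened on another server, one way to attach to the populated collection is sketched below. It assumes llama-index 0.10 or later with the `llama-index-vector-stores-qdrant` package installed; the helper name `retriever_from_existing_collection` and its parameters are hypothetical. Whether stored IndexNodes come back resolved to their parent objects may depend on the installed version, so treat that part as something to verify rather than a guarantee.

```python
def retriever_from_existing_collection(client, collection_name, embed_model, top_k=4):
    """Build a retriever over vectors that another process already wrote to Qdrant.

    `client` is a qdrant_client.QdrantClient and `embed_model` must be the same
    embedding model that was used at indexing time (both are assumptions here).
    """
    # Imported inside the function so that defining the helper does not itself
    # require the packages to be installed.
    from llama_index.core import VectorStoreIndex
    from llama_index.vector_stores.qdrant import QdrantVectorStore

    vector_store = QdrantVectorStore(client=client, collection_name=collection_name)
    # from_vector_store wraps the existing collection; nothing is re-embedded.
    index = VectorStoreIndex.from_vector_store(vector_store, embed_model=embed_model)
    return index.as_retriever(similarity_top_k=top_k)
```

The returned retriever takes a plain text query through `retrieve(...)`, which matches the requirement of not knowing any node ids at query time.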
