Updated 11 months ago

I am trying to implement small-to-big

I am trying to implement small-to-big retrieval and in this chunk of code

vector_index_chunk = VectorStoreIndex(all_nodes,
                                      service_context=self.service_context,
                                      storage_context=self.storage_context)
index = VectorStoreIndex.from_vector_store(vector_store=self.vector_store,
                                           service_context=self.service_context)
vector_retriever_chunk = vector_index_chunk.as_retriever(similarity_top_k=2)
retriever_chunk = RecursiveRetriever("vector",
                                     retriever_dict={"vector": vector_retriever_chunk},
                                     node_dict=all_nodes_dict,
                                     verbose=True)

if rag:
    query_engine_chunk = RetrieverQueryEngine.from_args(retriever_chunk, 
                                                        service_context=self.service_context)
    response = query_engine_chunk.query(query)
    return str(response)
else:
    result = retriever_chunk.retrieve(query)
    res = []
    for node in result:
        res += [display_source_node_custom(node, 200, True)]
    return res

I want the function to store the chunk index (vector_index_chunk) in the vector store and then read it back from there. When I run this function (the whole thing, not just this snippet) with rag = False, I get:

ValueError: Query id c092cd56-9404-43b8-84a0-d591c2cc2dc9 not found in either `retriever_dict` or `query_engine_dict`.

I'm guessing it's an error that the relevant node was not saved somehow, but I can't figure out what exactly causes the error.
11 comments
This won't work with the vector store alone. Nodes/chunks can currently only be retrieved by ID through the docstore.
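To illustrate the failure mode, here is a minimal, library-free sketch of the ID resolution step. This is not LlamaIndex's actual code: `resolve_reference` and the sample IDs are hypothetical, and only the error wording mirrors the message above. The point is that an ID returned by the retriever has to be present in one of the in-memory dicts, otherwise the lookup fails.

```python
from typing import Any, Dict


def resolve_reference(
    ref_id: str,
    node_dict: Dict[str, Any],
    retriever_dict: Dict[str, Any],
    query_engine_dict: Dict[str, Any],
) -> Any:
    """Toy lookup: find the object that a retrieved reference points to."""
    # Check each in-memory mapping in turn; nothing else is consulted.
    for mapping in (node_dict, retriever_dict, query_engine_dict):
        if ref_id in mapping:
            return mapping[ref_id]
    raise ValueError(
        f"Query id {ref_id} not found in either "
        "`retriever_dict` or `query_engine_dict`."
    )


# Hypothetical data: only "node-1" is known to the in-memory node dict.
node_dict = {"node-1": "parent chunk text"}
retriever_dict = {"vector": "vector retriever placeholder"}

# An ID that is in the dict resolves fine.
found = resolve_reference("node-1", node_dict, retriever_dict, {})

# An ID that only exists elsewhere (e.g. in a persisted vector store) does not.
try:
    resolve_reference("node-2", node_dict, retriever_dict, {})
    error_message = ""
except ValueError as exc:
    error_message = str(exc)
```

One plausible scenario (an assumption, not confirmed in this thread) is that the vector store returns IDs that are not in the `all_nodes_dict` built during the current run.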
Oh, darn, I was hoping it would work. Indeed, if I remove the vector store, it does run. Any idea if this will be implemented in the future?
Eh, for this, it's kind of recommended to use a docstore. We have quite a few docstore integrations (mongodb, redis, firestore, postgres)

For this to change:
a) The base vector store class needs a get() method
b) Every (or most) vector stores need to implement this
c) Logic has to be updated under the hood to optionally pull from the vector store vs. docstore
given the amount of work, it's kind of low priority lol unless a community member tackles it
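The three steps above could look roughly like the sketch below. Everything here is hypothetical and library-free: `BaseStore`, `InMemoryStore`, and `fetch_node` are made-up names for illustration, not existing LlamaIndex classes or functions. It shows (a) a `get()` on the base class, (b) one concrete implementation, and (c) lookup logic that prefers the docstore and optionally falls back to the vector store.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class BaseStore(ABC):
    """Hypothetical minimal interface: look up stored node text by ID."""

    @abstractmethod
    def get(self, node_id: str) -> Optional[str]:
        """Return the stored node text, or None if the ID is unknown."""


class InMemoryStore(BaseStore):
    """Hypothetical concrete store backed by a plain dict."""

    def __init__(self, items: Optional[Dict[str, str]] = None) -> None:
        self._items = dict(items or {})

    def get(self, node_id: str) -> Optional[str]:
        return self._items.get(node_id)


def fetch_node(
    node_id: str,
    docstore: Optional[BaseStore] = None,
    vector_store: Optional[BaseStore] = None,
) -> str:
    """Prefer the docstore; fall back to the vector store if one is given."""
    for store in (docstore, vector_store):
        if store is None:
            continue
        node = store.get(node_id)
        if node is not None:
            return node
    raise KeyError(f"node {node_id!r} not found in any configured store")


# With only a vector store configured, the lookup still succeeds.
vector_only = InMemoryStore({"n1": "from vector store"})
result = fetch_node("n1", vector_store=vector_only)
```

The hard part is not this logic but step (b): every vector store backend would need its own `get()` implementation.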
Sounds good, thanks 🙂
@Logan M ideally we could pass a docstore to the RecursiveRetriever; right now we first have to build a huge dict and then pass that.
looking at the code, making a RecursiveRetriever instance and then setting the _node_dict attribute to the docstore will probably work?
hmmm, it's not that simple. there is no .get for the docstore, but we probably only need the .get_document for it?
docstore.docs is a dict of node_id -> nodes, I think that works?
in a kv docstore, it's a property that builds up the whole dict; a more direct method/property to get a node by id would be nice to have. i'll see if i can find what i want. the good thing is that the docstore has no .get, so i can implement one myself in a subclass 😉
actually, passing a dict means it's not updated, so it's quite annoying if you live-update the index
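One way to avoid the stale snapshot is a read-only, dict-like view that forwards each lookup to the docstore. The sketch below is library-free and untested against RecursiveRetriever itself. `FakeDocstore` is a stand-in that only assumes the `get_document` method and `docs` property mentioned in this thread, and `DocstoreNodeDict` is a hypothetical name. It also assumes `get_document` raises `ValueError` for an unknown ID.

```python
from collections.abc import Mapping
from typing import Any, Dict, Iterator


class FakeDocstore:
    """Stand-in docstore exposing only `get_document` and `docs`."""

    def __init__(self) -> None:
        self._docs: Dict[str, Any] = {}

    def add(self, node_id: str, node: Any) -> None:
        self._docs[node_id] = node

    def get_document(self, node_id: str) -> Any:
        # Assumption: an unknown ID raises ValueError.
        if node_id not in self._docs:
            raise ValueError(f"doc_id {node_id} not found.")
        return self._docs[node_id]

    @property
    def docs(self) -> Dict[str, Any]:
        # In a kv docstore this may rebuild the whole dict on every access.
        return dict(self._docs)


class DocstoreNodeDict(Mapping):
    """Live, read-only view: every lookup goes to the docstore."""

    def __init__(self, docstore: Any) -> None:
        self._docstore = docstore

    def __getitem__(self, node_id: str) -> Any:
        try:
            return self._docstore.get_document(node_id)
        except ValueError as exc:
            # Mapping's `in` check relies on KeyError for missing keys.
            raise KeyError(node_id) from exc

    def __iter__(self) -> Iterator[str]:
        return iter(self._docstore.docs)  # potentially expensive

    def __len__(self) -> int:
        return len(self._docstore.docs)  # potentially expensive

    def __bool__(self) -> bool:
        # Stay truthy even when empty, in case a consumer writes
        # `node_dict or {}` and would otherwise drop the view.
        return True


store = FakeDocstore()
live_nodes = DocstoreNodeDict(store)
seen_before = "a" in live_nodes   # docstore is still empty here
store.add("a", "chunk A")         # simulate a live index update
seen_after = "a" in live_nodes    # the view sees the new node
```

Because `in` and `[]` only call `get_document`, single lookups never build the full dict; only iteration and `len()` touch `docs`.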