Updated 4 months ago

Where do I declare `top_k` argument?

    vector_store = QdrantVectorStore(client=qdrant_vectorstore_client,
                                     collection_name=str(uuid.uuid4()))
    
    index = VectorStoreIndex.from_vector_store(vector_store=vector_store)
    retriever = index.as_retriever()
    nodes = retriever.retrieve(question)
9 comments
retriever = index.as_retriever(similarity_top_k=2)
you beat me to it this time.
Guys, how am I supposed to figure out that I need to pass that argument to as_retriever()? (general question, no hate)
I mean, it's not in the docs
Yeah, fair point. It's a long story. It makes more sense when you look at the low-level API: as_retriever() is a general shorthand for creating a VectorIndexRetriever(), so all args get passed through to it.

And even (worse?) as_query_engine() is shorthand for creating a retriever AND a retriever query engine. Now we have double the amount of args.

Due to how many args there are, and how many indexes, the current code just abuses kwargs and passes them to underlying components.
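To make the kwargs forwarding concrete, here is a minimal sketch of the pattern. ToyIndex and ToyRetriever are hypothetical stand-ins invented for illustration, not LlamaIndex classes, and the default of 2 and the word-overlap scoring are assumptions of the sketch. The point is only that the shorthand method never names similarity_top_k in its own signature, which is why the option is hard to discover.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ToyRetriever:
    """Hypothetical stand-in for a retriever; not LlamaIndex code."""

    index: "ToyIndex"
    similarity_top_k: int = 2  # default chosen for this sketch only

    def retrieve(self, query: str) -> list[str]:
        # Rank documents by naive word overlap and keep the top k.
        words = set(query.lower().split())
        ranked = sorted(
            self.index.docs,
            key=lambda doc: len(words & set(doc.lower().split())),
            reverse=True,
        )
        return ranked[: self.similarity_top_k]


@dataclass
class ToyIndex:
    """Hypothetical stand-in for an index; not LlamaIndex code."""

    docs: list[str]

    def as_retriever(self, **kwargs: Any) -> ToyRetriever:
        # Shorthand: every keyword argument is forwarded untouched, so
        # this signature never mentions similarity_top_k explicitly.
        return ToyRetriever(index=self, **kwargs)


index = ToyIndex(docs=["qdrant vector store", "top k retrieval", "query engine args"])
retriever = index.as_retriever(similarity_top_k=1)
nodes = retriever.retrieve("top k")
```

In this toy version a misspelled keyword fails with a TypeError when the retriever is constructed; whether a real library surfaces the mistake that early depends on how each underlying component handles its own kwargs.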

There are a few examples of setting similarity_top_k floating around.
https://docs.llamaindex.ai/en/stable/understanding/querying/querying.html#customizing-the-stages-of-querying
Arguably the docs could do a better job of exposing that option, or even just have better API reference docs.
Yeah, I've never seen such a large amount of kwargs in my life while inspecting the source code of Python libs 😄
too many things to configure 😅 Huggingface even has its own object for arguments when training -- TrainingArguments 😆
Probably a symptom of trying to make too many people happy