Where do I declare `top_k` argument?

Where do I declare top_k argument?

    vector_store = QdrantVectorStore(client=qdrant_vectorstore_client,
                                     collection_name=str(uuid.uuid4()))
    
    index = VectorStoreIndex.from_vector_store(vector_store=vector_store)
    retriever = index.as_retriever()
    nodes = retriever.retrieve(question)

9 comments

Logan M

retriever = index.as_retriever(similarity_top_k=2)

NPC_Kenny

you beat me to it this time.

pikachu8887867

Guys, how should I even figure out that I need to pass that argument to as_retriever()? (general question, no hate)

pikachu8887867

I mean, it's not in the docs

Logan M

Yeah, fair point. It's a long story. It makes more sense when you look at the low-level API: as_retriever() is a general shorthand for creating a VectorIndexRetriever, so all args get passed through to it.

And even (worse?) as_query_engine() is shorthand for creating a retriever AND a retriever query engine. Now we have double the amount of args.

Due to how many args there are, and how many indexes, the current code just abuses kwargs and passes them to underlying components.

There are a few examples of setting similarity_top_k floating around.
https://docs.llamaindex.ai/en/stable/understanding/querying/querying.html#customizing-the-stages-of-querying
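
To make the kwargs forwarding described above concrete, here is a minimal sketch in plain Python. `ToyIndex`, `ToyRetriever`, and `ToyQueryEngine` are hypothetical stand-ins invented for illustration, not LlamaIndex's actual classes; only the name `similarity_top_k` comes from the thread. It also shows one way to find the accepted arguments: inspect the class the shorthand forwards to, since the shorthand itself only advertises `**kwargs`.

```python
import inspect


class ToyRetriever:
    """Hypothetical stand-in for a low-level retriever class."""

    def __init__(self, docs, similarity_top_k=2):
        self.docs = docs
        self.similarity_top_k = similarity_top_k

    def retrieve(self, query):
        # Rank documents by how many query words they share, keep the top k.
        words = set(query.lower().split())
        ranked = sorted(
            self.docs,
            key=lambda doc: len(words & set(doc.lower().split())),
            reverse=True,
        )
        return ranked[: self.similarity_top_k]


class ToyQueryEngine:
    """Hypothetical stand-in for a query engine wrapping a retriever."""

    def __init__(self, retriever, prefix="Context: "):
        self.retriever = retriever
        self.prefix = prefix

    def query(self, question):
        return self.prefix + " | ".join(self.retriever.retrieve(question))


class ToyIndex:
    """Hypothetical index exposing the two shorthand constructors."""

    def __init__(self, docs):
        self.docs = docs

    def as_retriever(self, **kwargs):
        # Shorthand: every keyword argument is forwarded untouched, so the
        # accepted names are whatever ToyRetriever.__init__ takes.
        return ToyRetriever(self.docs, **kwargs)

    def as_query_engine(self, prefix="Context: ", **kwargs):
        # Shorthand for two objects at once: leftover kwargs go to the retriever.
        return ToyQueryEngine(self.as_retriever(**kwargs), prefix=prefix)


index = ToyIndex(["qdrant vector store", "vector index retriever", "training arguments"])
retriever = index.as_retriever(similarity_top_k=1)
nodes = retriever.retrieve("vector retriever")

# The shorthand's signature only shows **kwargs; the real options are
# visible on the class it forwards to.
shorthand_params = list(inspect.signature(ToyIndex.as_retriever).parameters)
retriever_params = list(inspect.signature(ToyRetriever.__init__).parameters)
```

In this toy, `shorthand_params` contains only `self` and `kwargs`, while `retriever_params` lists `similarity_top_k`. That mirrors the situation in the thread: the option is easier to discover from the underlying retriever class than from the shorthand method.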

Logan M

Arguably the docs could do a better job of exposing that option, or even just have better API reference docs.

pikachu8887867

Yeah, I've never seen such a large amount of kwargs in my life while inspecting the source code of python libs 😄

Logan M

too many things to configure 😅 Hugging Face even has its own object for arguments when training -- TrainingArguments 😆
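
For comparison, the "object for arguments" idea can be sketched with a dataclass. `RetrieverArgs`, its `score_cutoff` field, and `build_retriever_config` are hypothetical names made up for this illustration; this is not how Hugging Face or LlamaIndex implement it.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class RetrieverArgs:
    """Hypothetical arguments object; not part of LlamaIndex or Hugging Face."""

    similarity_top_k: int = 2
    score_cutoff: float = 0.0


def build_retriever_config(args: RetrieverArgs) -> dict:
    # Every option is a named, typed field, so the available settings can be
    # read off the class definition instead of traced through **kwargs.
    return {f.name: getattr(args, f.name) for f in fields(args)}


config = build_retriever_config(RetrieverArgs(similarity_top_k=5))
```

With this shape, a misspelled option such as `RetrieverArgs(top_k=5)` raises a `TypeError` at construction, and the full list of options lives in one place.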

Logan M

Probably a symptom of trying to make too many people happy
