Similarity_top_k limit not being respected

I set similarity_top_k to 100 but only get back 20-22 nodes. Why is that? I'm using the AgentRunner and a basic knowledge tool.
Do you have a similarity cutoff and/or a reranker as a node postprocessor?
Looks like the default is 0.0

similarity_cutoff: float = Field(default=0.0)
Maybe you only have 20-22 nodes?
You can test the retriever directly too
Plain Text
# Bypass the query engine and check how many nodes come back raw
nodes = index.as_retriever(similarity_top_k=100).retrieve("test")
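If a cutoff postprocessor is attached downstream, a quick way to see it shrinking the result set is something like this (a sketch; the 0.7 threshold is purely illustrative):
Python
from llama_index.core.postprocessor import SimilarityPostprocessor

# A cutoff above 0.0 silently drops every node scoring below the
# threshold, which can shrink 100 requested nodes down to 20-22.
cutoff = SimilarityPostprocessor(similarity_cutoff=0.7)  # illustrative value
filtered = cutoff.postprocess_nodes(nodes)
print(f"{len(nodes)} retrieved -> {len(filtered)} after cutoff")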
Thanks guys, I solved it. I had some extra logic that fits my nodes into the context window, capped at some max size.
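For illustration, logic along these lines would cap the node count regardless of similarity_top_k (a hypothetical reconstruction; the budget and helper name are not from the thread):
Python
# Hypothetical reconstruction: packing chunks into a fixed token budget
# caps how many nodes survive, no matter what similarity_top_k was.
def fit_to_context(chunks: list[str], max_tokens: int = 4000) -> list[str]:
    kept, used = [], 0
    for chunk in chunks:
        size = len(chunk.split())  # crude whitespace token estimate
        if used + size > max_tokens:
            break  # everything past the budget gets dropped here
        kept.append(chunk)
        used += size
    return kept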
Another question on this topic: can I somehow set top_k to infinity, so that my retriever just returns the maximum number of nodes?
I'm using a second reranker model afterwards, bge-reranker-v2-m3, and it performs really well. Some nodes are actually ranked really badly by the RetrieverQueryEngine; the node that matters in the end is sometimes the 200th or lower, and bge resolves it correctly, so I would just like to pass all nodes to it.
Currently I'm using a workaround like similarity_top_k=9999, but I'd like to know if there is a better way.
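Roughly, the workaround looks like this (a sketch, assuming an existing `index`; the top_n of 10 is illustrative):
Python
from llama_index.postprocessor.flag_embedding_reranker import FlagEmbeddingReranker

# Over-fetch with a huge top_k, then let the cross-encoder pick winners.
reranker = FlagEmbeddingReranker(model="BAAI/bge-reranker-v2-m3", top_n=10)
query_engine = index.as_query_engine(
    similarity_top_k=9999,  # effectively "give me everything"
    node_postprocessors=[reranker],
)
response = query_engine.query("your question here")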
But the reranker is also kind of slow. If you have any ideas for getting better reranking performance while maintaining acceptable response latency, please tell me ^^
Yea, there's no way to set it to infinity. Some vector stores do have a get_nodes method, but you still need to provide either node_ids or metadata filters.
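For reference, that path looks roughly like this (a sketch; whether get_nodes is implemented depends on the vector store, and the filter key/value are purely illustrative):
Python
from llama_index.core.vector_stores import MetadataFilter, MetadataFilters

# get_nodes still needs node_ids or filters -- there is no "fetch all" flag.
filters = MetadataFilters(filters=[MetadataFilter(key="source", value="docs")])
nodes = vector_store.get_nodes(filters=filters)  # assumes a supporting store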
Not sure which reranker you are using; if it's running locally, speed is largely dependent on your machine specs (having a GPU will help)

LLM rerankers are slow af in general

API-based rerankers (like Cohere) are your best bet for latency
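A minimal sketch with Cohere's reranker (assumes COHERE_API_KEY is set in the environment and an existing `index`; top_n is illustrative):
Python
from llama_index.postprocessor.cohere_rerank import CohereRerank

# The heavy model runs on Cohere's side, so local cost is one API round trip.
reranker = CohereRerank(top_n=10)  # reads COHERE_API_KEY from the environment
query_engine = index.as_query_engine(
    similarity_top_k=100,
    node_postprocessors=[reranker],
)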
As I said, I use bge-reranker-v2-m3 via FlagEmbeddingReranker. I think there is also the possibility of using a BGEM3Index directly? I've seen it somewhere in the docs. It would be cool to apply this directly in the retrieval step instead of having to postprocess.
Hmm, that's a decent-sized model (2.5 GB) -- without a GPU this will definitely be painfully slow.

It will also get slower the higher the initial top_k is

There is a BGE index, but it uses multi-vector (i.e. ColBERT-style) retrieval, so I'd expect it to be pretty resource-hungry (it generates a vector per token rather than per chunk)
https://docs.llamaindex.ai/en/stable/api_reference/indices/bge_m3/
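A hedged sketch of using it (the import path is from memory and may differ by version; `documents` is assumed):
Python
from llama_index.indices.managed.bge_m3 import BGEM3Index

# ColBERT-style multi-vector index: one vector per token, so expect
# heavy memory use compared to a per-chunk vector index.
index = BGEM3Index.from_documents(documents)
retriever = index.as_retriever()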