
Updated 3 months ago

We are using pgvector and when

We are using pgvector, and when retrieving results from the database for a user query, I see that we are limiting nodes to 2. Is there any reason why we are hardcoding 2 as the limit?


1749450564,-0.016525879502296448]'
     LIMIT 2

6 comments

Teemu

Do you have it set in your script or do you wish to change it?

Logan M

The vector index has a default top k of 2, but you can change this.

index.as_query_engine(similarity_top_k=3)
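To illustrate what the top-k setting controls, here is a minimal pure-Python sketch of nearest-neighbour selection by cosine distance, which is conceptually what an `ORDER BY <distance> LIMIT k` query does. It is not the library's implementation: `cosine_distance`, `top_k_nodes`, and the toy `rows` data are hypothetical names made up for this example.

```python
import heapq
import math


def cosine_distance(a, b):
    """Cosine distance: 1 minus the cosine similarity of two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def top_k_nodes(query_embedding, rows, top_k=2):
    """Return the top_k rows closest to the query (hypothetical helper).

    rows is a list of (node_id, embedding) pairs. The default of 2 mirrors
    the LIMIT 2 seen in the question.
    """
    return heapq.nsmallest(
        top_k, rows, key=lambda row: cosine_distance(query_embedding, row[1])
    )


# Toy data, invented for illustration only.
rows = [
    ("a", [1.0, 0.0]),
    ("b", [0.9, 0.1]),
    ("c", [0.0, 1.0]),
    ("d", [0.7, 0.7]),
]
query = [1.0, 0.0]

default_ids = [node_id for node_id, _ in top_k_nodes(query, rows)]
wider_ids = [node_id for node_id, _ in top_k_nodes(query, rows, top_k=3)]
print(default_ids)  # two closest nodes
print(wider_ids)    # three closest nodes
```

Raising `top_k` only widens the cut-off on the same ranked list, so the extra nodes are always the next-closest ones.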

nivas

Thanks @Logan M

nivas

So @Logan M, what will be the difference in performance, recall, etc. when we change top_k? Any suggestions or best practices around this topic?

Logan M

Increasing top-k increases the number of documents sent to the LLM. But this can both a) increase latency and b) sometimes increase hallucination if too many documents are retrieved.

One approach that's often used: set a high top-k, and then use a reranker to filter back down to a smaller set of relevant documents.
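A rough sketch of that retrieve-wide-then-rerank pattern, in plain Python. Everything here is an assumption for illustration: `rerank` and `token_overlap` are hypothetical helpers, and the word-overlap score is only a stand-in for a real reranking model.

```python
def token_overlap(query, doc):
    """Toy relevance score: fraction of query words that appear in the doc.

    This is a placeholder for a real reranker model, not a recommendation.
    """
    query_words = set(query.lower().split())
    doc_words = set(doc.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & doc_words) / len(query_words)


def rerank(query, candidates, score_fn, top_n):
    """Re-score the retrieved candidates and keep only the best top_n."""
    ranked = sorted(candidates, key=lambda doc: score_fn(query, doc), reverse=True)
    return ranked[:top_n]


# Pretend these five documents came back from a retrieval with a high top-k.
candidates = [
    "pgvector stores embeddings in postgres",
    "the default top k is 2",
    "how to change the default top k in the query engine",
    "postgres supports many index types",
    "rerankers filter retrieved documents",
]
query_text = "change default top k"

# Filter back down to a small set before sending anything to the LLM.
kept = rerank(query_text, candidates, token_overlap, top_n=2)
for doc in kept:
    print(doc)
```

The LLM then sees only the `top_n` survivors, so the prompt stays small even though the initial retrieval was wide.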

nivas

Thanks @Logan M, you are awesome!! Always helpful!!
