We are using pgvector, and when retrieving results from the database for a user query, I see that we limit the nodes to 2. Is there any reason why we are hardcoding 2 as the limit?
Raising top-k increases the number of documents sent to the LLM. But this can both a) increase latency and b) sometimes increase hallucination if too many documents are retrieved.
A common approach is to set a high top-k and then use a reranker to filter the results back down to a smaller set of relevant documents.
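
To make the retrieve-then-rerank idea concrete, here is a minimal pure-Python sketch. It is not this project's code or pgvector's API: the in-memory list stands in for the vector table, and `retrieve`, `overlap_score`, and `retrieve_then_rerank` are hypothetical names. The word-overlap scorer is only a placeholder for a real reranker (for example a cross-encoder model), and the `top_k=20` / `top_n=2` defaults are illustrative, not recommendations.

```python
import math
from typing import Callable, List, Tuple

Doc = Tuple[str, List[float]]  # (text, embedding); stands in for rows in the vector table


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_vec: List[float], docs: List[Doc], top_k: int) -> List[str]:
    """Stage 1: broad vector search, keeping the top_k most similar documents."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]


def overlap_score(query: str, text: str) -> float:
    """Placeholder reranker: fraction of query words that appear in the text.
    A real setup would likely call a cross-encoder or a reranking API here."""
    q_words = set(query.lower().split())
    t_words = set(text.lower().split())
    return len(q_words & t_words) / len(q_words) if q_words else 0.0


def retrieve_then_rerank(
    query: str,
    query_vec: List[float],
    docs: List[Doc],
    top_k: int = 20,   # assumed value: wide net for the vector search
    top_n: int = 2,    # assumed value: small set actually sent to the LLM
    score: Callable[[str, str], float] = overlap_score,
) -> List[str]:
    """Stage 2: rescore the top_k candidates and keep only the best top_n."""
    candidates = retrieve(query_vec, docs, top_k)
    reranked = sorted(candidates, key=lambda t: score(query, t), reverse=True)
    return reranked[:top_n]
```

With this shape, the number of documents the LLM sees stays small (`top_n`) while the vector search can look at many more candidates (`top_k`), so a relevant document that ranks third or fourth by embedding distance still has a chance of being used.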