Top_K

Hello, everyone. I am interested in the "similarity_top_k" argument.

Is it possible to make it dynamic? That is, if only one paragraph is a strong match, similarity_top_k == 1; if several are really relevant, similarity_top_k == 2, and so on.
You can use the similarity node postprocessor and set a threshold value for relevancy.

What this will do is keep only the nodes whose similarity score is above the set threshold, and those nodes are the ones used for response generation.

You can then set the top_k value to 10 or 15, but it will only bring back the nodes that clear the threshold.

https://docs.llamaindex.ai/en/stable/module_guides/querying/node_postprocessors/node_postprocessors.html#similaritypostprocessor
Pass this postprocessor to the query_engine, like this:

Plain Text
from llama_index.indices.postprocessor import SimilarityPostprocessor

# Drop any retrieved node whose similarity score is below 0.7
postprocessor = SimilarityPostprocessor(similarity_cutoff=0.7)

# Retrieve up to 10 candidates, then let the postprocessor filter them
query_engine = index.as_query_engine(
    similarity_top_k=10, node_postprocessors=[postprocessor]
)
response = query_engine.query(
    "How much did the author raise in seed funding from Idelle's husband"
    " (Julian) for Viaweb?",
)
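One way to see the "dynamic top_k" effect is to check how many source nodes actually survived the cutoff for a given query; the effective top_k then varies per query even though 10 was requested. A minimal sketch, assuming the response object exposes the usual source_nodes list of scored nodes:

Plain Text
# Count the nodes that passed the similarity cutoff for this particular query
print(len(response.source_nodes))

# Peek at each surviving node's score and a snippet of its text
for scored_node in response.source_nodes:
    print(scored_node.score, scored_node.node.get_content()[:80])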
I haven't done this yet, but I have plans to do this programmatically - if you extract some metadata and label the nodes during node parsing, you can count the relevant nodes for each query and derive the top_k value from that information. A rough sketch of the idea is below.
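A simpler variant of that idea can be sketched with similarity scores alone: retrieve a generous pool of candidates, count how many clear a relevancy cutoff, and use that count as the per-query top_k. The helper name, the 0.7 cutoff, and max_k below are made up for illustration, and the metadata-labeling step described above is not shown:

Plain Text
def dynamic_top_k(index, query, cutoff=0.7, max_k=15):
    # Retrieve a generous candidate pool first
    retriever = index.as_retriever(similarity_top_k=max_k)
    nodes = retriever.retrieve(query)
    # Count the candidates that clear the relevancy cutoff
    relevant = [n for n in nodes if n.score is not None and n.score >= cutoff]
    # Always keep at least one node so the engine has some context
    return max(1, len(relevant))

k = dynamic_top_k(index, "How much did the author raise for Viaweb?")
query_engine = index.as_query_engine(similarity_top_k=k)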