
Checking similarity

At a glance

The community members discuss how to fetch relevant source nodes without calling a large language model (LLM). They suggest index.query(..., response_mode="no_text") to fetch only the source nodes without an LLM call, and index.query(..., similarity_cutoff=0.5) to filter results by similarity, though the behavior when every node is filtered out is uncertain. They also discuss increasing the number of results returned with index.query(..., similarity_top_k=4), and confirm that using similarity_cutoff without response_mode="no_text" will not call the LLM, only the embeddings model.

can I do that before I start the whole LLM QA bit?
Yes! If you do something like index.query(..., response_mode="no_text") it will only fetch the source nodes and not call the LLM
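To make the flow concrete, here is a minimal pure-Python sketch of what response_mode="no_text" implies: retrieval happens, but the LLM synthesis step is skipped. All names here are illustrative, not the library's internals.

```python
def query(scored_nodes, response_mode="default"):
    """Toy model of the query flow: retrieve, then (maybe) synthesize."""
    # Retrieval: rank nodes by similarity score (embeddings only, no LLM).
    source_nodes = sorted(scored_nodes, key=lambda n: n["score"], reverse=True)
    if response_mode == "no_text":
        # No LLM call: only the retrieved source nodes come back.
        return {"response": None, "source_nodes": source_nodes}
    # In the real library, an LLM call would synthesize the answer here.
    return {"response": "<LLM-synthesized answer>", "source_nodes": source_nodes}

nodes = [{"text": "shipping policy", "score": 0.82},
         {"text": "holiday FAQ", "score": 0.31}]
result = query(nodes, response_mode="no_text")
print(result["response"])            # None -- the LLM was never called
print(len(result["source_nodes"]))   # 2
```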
You might also be interested in the similarity filtering option

index.query(..., similarity_cutoff=0.5)

But I'm not sure what the behavior is if all nodes get filtered out πŸ€”
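A quick pure-Python sketch of what a similarity cutoff does (function name and data shape are mine, not the library's), including the edge case where the cutoff filters out everything:

```python
def apply_cutoff(scored_nodes, similarity_cutoff):
    """Keep only nodes whose similarity score clears the threshold."""
    return [n for n in scored_nodes if n["score"] >= similarity_cutoff]

nodes = [{"text": "product specs", "score": 0.74},
         {"text": "bahamas holidays", "score": 0.22}]

print(apply_cutoff(nodes, 0.5))  # only the 0.74 node survives
print(apply_cutoff(nodes, 0.9))  # [] -- everything filtered out (the edge case)
```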
I'll test it, and set it to 0.0 🀣
yes, I think cutoff might be what I need
maybe it should be in the demonstration or higher up in the docs?
seems like a pretty valid use case: customers can ask about a company's products, not about holidays in the Bahamas
I'll poke about in the code to see exactly what they both do
Sounds good! πŸ‘
hmmmmm, I only ever get one item in response.source_nodes
is that expected behaviour for the default mode?
Yup! You can increase this though

index.query(..., similarity_top_k=4)
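A toy illustration (again, not the library's internals): a top-k parameter caps how many of the highest-scoring nodes come back, and a default of 1 would explain only ever seeing one item in response.source_nodes.

```python
def top_k(scored_nodes, similarity_top_k=1):
    """Return the highest-scoring nodes, capped at similarity_top_k."""
    ranked = sorted(scored_nodes, key=lambda n: n["score"], reverse=True)
    return ranked[:similarity_top_k]

nodes = [{"text": "a", "score": 0.9},
         {"text": "b", "score": 0.7},
         {"text": "c", "score": 0.4}]
print([n["text"] for n in top_k(nodes)])                      # ['a'] -- default k=1
print([n["text"] for n in top_k(nodes, similarity_top_k=4)])  # all 3; k caps at list length
```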
Will it call the LLM when I use index.query(..., similarity_cutoff=0.5) without specifying response_mode="no_text"?
It won't call the LLM, only the embeddings model (assuming you have a vector index)
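The point that only the embeddings model is involved can be seen from how similarity scoring works: it's just vector math over embeddings, sketched here with plain cosine similarity (no LLM anywhere in the loop).

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

query_emb = [1.0, 0.0]   # embedding of the query text
doc_emb = [0.6, 0.8]     # embedding of a document node
print(round(cosine_similarity(query_emb, doc_emb), 2))  # 0.6
```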