
Updated 7 months ago

Hello, I use VectorStoreIndex to

At a glance

The community member is using VectorStoreIndex to evaluate an embedding model and asks which similarity function it uses by default. They noticed that the node scores all fall between 0 and 1, rather than spanning the -1 to 1 range that cosine similarity allows, and suspect that a normalized L2 function is being used instead. A reply explains how to change the default similarity function: subclass the embedding model and change the similarity mode. The original poster then asks again why the scores stay in the 0-1 range even with the default cosine function, and a later reply posts the default calculation.

Hello, I use VectorStoreIndex to evaluate an embedding model. I think the default similarity function is cosine similarity. But when I check the node scores, they are all between (0, 1) rather than (-1, 1). It seems like a normalized L2 function may be in use instead.
Why are all the similarity scores between 0 and 1? Can I change the default similarity function of VectorStoreIndex to a specific function instead of the default cosine function?
6 comments
You'll have to subclass your embedding model and change the similarity mode to fit your needs.
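As a rough illustration of that suggestion, the sketch below uses stand-in classes rather than the real library ones. `ToyEmbedding`, its `similarity` method, and `DotProductEmbedding` are hypothetical names, so check the actual base class and method signature in your installed version before copying the pattern.

```python
import math
from typing import Sequence


class ToyEmbedding:
    """Hypothetical stand-in for an embedding model base class."""

    def similarity(self, a: Sequence[float], b: Sequence[float]) -> float:
        # Default scoring: cosine similarity (dot product over the norms).
        product = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return product / norm


class DotProductEmbedding(ToyEmbedding):
    """Subclass that swaps the scoring function for a raw dot product."""

    def similarity(self, a: Sequence[float], b: Sequence[float]) -> float:
        # No length normalization, so the score is not bounded by 1.
        return sum(x * y for x, y in zip(a, b))


a, b = [3.0, 4.0], [6.0, 8.0]
cosine_score = ToyEmbedding().similarity(a, b)       # parallel vectors
dot_score = DotProductEmbedding().similarity(a, b)   # depends on vector length
```

With these two parallel vectors the cosine score is 1 while the dot product grows with the vector lengths, which is the kind of difference to expect when switching modes.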
Thanks a lot. With your kind help I now know how to change the similarity function. But I still have one more question: why are all the similarity scores between (0, 1)? Since I am using the default COSINE function, they should be in (-1, 1), right?
Maybe @Logan M can help with this query
This is the default calculation

import numpy as np

def cosine_similarity(embedding1, embedding2):  # illustrative wrapper name
    product = np.dot(embedding1, embedding2)
    norm = np.linalg.norm(embedding1) * np.linalg.norm(embedding2)
    return product / norm
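To see what range this formula can produce, here is a small dependency-free sketch of the same calculation. The function name `cosine_similarity` and the sample vectors are made up for illustration. The formula itself can return anything from -1 to 1. Scores land between 0 and 1 whenever the two vectors point in broadly the same direction, which is one plausible explanation for the scores in the question, though nothing in this thread confirms it.

```python
import math


def cosine_similarity(embedding1, embedding2):
    # Dot product divided by the product of the two L2 norms.
    product = sum(x * y for x, y in zip(embedding1, embedding2))
    norm = math.sqrt(sum(x * x for x in embedding1)) * math.sqrt(
        sum(y * y for y in embedding2)
    )
    return product / norm


same = cosine_similarity([1.0, 0.0], [1.0, 0.0])         # identical direction
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])   # unrelated direction
opposite = cosine_similarity([1.0, 0.0], [-1.0, 0.0])    # reversed direction

# When every component of both vectors is non-negative, the dot product
# cannot be negative, so the score cannot drop below 0.
non_negative = cosine_similarity([0.2, 0.9, 0.1], [0.7, 0.1, 0.4])
```

Negative scores only appear when the vectors point away from each other, so a set of scores that never goes below 0 does not by itself mean that a different similarity function is in use.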