
Updated last year

so for things like faithfulness and

At a glance

Community members discuss how faithfulness and relevancy scores are generated. One member confirms that they come from an LLM acting as a judge rather than from a formula; the exception is semantic similarity, which is plain cosine similarity. Asked whether metrics such as BLEU or ROUGE could be used, another member says they are open to adding them but notes that those metrics require ground truth, which most people do not have, so they were a lower priority. The same member argues that ROUGE scores are not very informative because a response can be written in many ways, and says a contribution adding such metrics to the repo would be merged.

so for things like faithfulness and relevancy, the scores are generated by some outside LLM right? There's no formula that's used to generate that value?
10 comments
Correct, it's LLM-as-a-judge (except for the semantic similarity one, that's just cosine similarity)
Thanks! I saw that cosine similarity can be switched for things like dot product and Euclidean distance, but is there any way to use other metrics like BLEU or ROUGE?
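For readers comparing the three similarity options mentioned here, a minimal pure-Python sketch follows. The function names and the toy vectors are made up for illustration and are not the library's API; real embeddings would come from an embedding model. It shows that cosine similarity ignores vector length, while dot product and Euclidean distance do not.

```python
import math

def dot_product(a, b):
    # Sum of element-wise products; grows with vector magnitude.
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    # Dot product normalized by both vector lengths; depends only on direction.
    return dot_product(a, b) / (math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b)))

def euclidean_distance(a, b):
    # Straight-line distance; 0 means identical vectors, larger means further apart.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Toy "embeddings" (hypothetical values): same direction, different length.
reference = [1.0, 2.0, 3.0]
scaled = [2.0, 4.0, 6.0]

print(cosine_similarity(reference, scaled))   # close to 1.0: same direction
print(dot_product(reference, scaled))         # 28.0: affected by magnitude
print(euclidean_distance(reference, scaled))  # sqrt(14): affected by magnitude
```

Note that Euclidean distance is a distance rather than a similarity, so lower is better, which matters if the result is compared against a threshold.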
open to adding them (those assume you have ground truth to compare to)
but they have their own pitfalls
yes, I've been using a labeled rag dataset example from llamahub for right now
also interesting, is there a reason why you chose those metrics (were they good enough for measuring semantic similarity), or was it just a speed/ease of implementation situation?
In most cases people don't have ground truth to compare to, so it was a lower priority. Also, imo, they are a tad less helpful? maybe a hot take hahaha

There are so many ways to write a response. A ROUGE score of 30 isn't really that informative, even if it's what academia has clung to the past few years
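To make the "so many ways to write a response" point concrete, here is a rough sketch of a ROUGE-1 style score: unigram overlap F1 against a ground-truth reference. It is a simplified stand-in written for illustration, not the official ROUGE implementation (no stemming, no ROUGE-2 or ROUGE-L), and the example sentences are invented. It returns a value between 0 and 1, whereas scores are often reported on a 0 to 100 scale.

```python
from collections import Counter

def rouge1_f1(candidate, reference):
    # Count lowercase whitespace-separated tokens in each text.
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clipped overlap: each token counts at most as often as it appears in both.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

reference = "the cat sat on the mat"    # ground truth, required by this metric
verbatim = "the cat sat on the mat"     # exact copy of the reference
paraphrase = "a feline rested upon a rug"  # same meaning, no shared words

print(rouge1_f1(verbatim, reference))    # 1.0
print(rouge1_f1(paraphrase, reference))  # 0.0
```

The paraphrase says roughly the same thing as the reference but shares no words with it, so a purely lexical metric scores it at zero. An embedding-based or LLM-judged score would not necessarily penalize it the same way.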
But if someone contributes it to the repo, it will definitely get merged 🙂
sounds good! I'll see what I can do