
Updated 3 months ago

@ravitheja Thanks a lot for the notebook

Thanks a lot for the notebook on "building production ready pipeline". I am a complete noob and have a few questions:

  1. Why did we choose this embedding model? embed_model = "local:BAAI/bge-small-en-v1.5"
  2. And why did we choose this model for reranking? model="BAAI/bge-reranker-base"
  3.1. Which underlying model does this function use to generate the questions that are used for model evaluation?
  3.2. What if there are mistakes in this model's output?

async agenerate_dataset_from_nodes(num: int | None = None) → QueryResponseDataset
Generates questions for each document.

  4. Relevancy:
Evaluates the relevancy of retrieved contexts and responses to a query. This evaluator considers the query string, retrieved contexts, and response string.

So why do we need an LLM (GPT-4 in this example) to evaluate relevancy? Relevancy tells us whether the generated response agrees with the retrieved context and the user query, which means we just need the query, the retriever output/context, and the response string (which we got from gpt-3.5).
3 comments
Why did we choose this embedding model? embed_model = "local:BAAI/bge-small-en-v1.5"

You can refer to the embedding leaderboard (MTEB) on Hugging Face for the metrics. The model seems to perform well on the major benchmarks.

And why did we choose this model for reranking? model="BAAI/bge-reranker-base"

Reranking is also task-specific, but the bge reranker or a model from Sentence Transformers is a good starting point.
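To make the roles of the two models concrete, here is a rough sketch of the retrieve-then-rerank pattern in plain Python. It is not the notebook's code: `retrieve_then_rerank`, `word_overlap` and `exact_phrase` are hypothetical names, and the two toy scoring functions only stand in for a real embedding model (such as bge-small) and a real reranker (such as bge-reranker-base).

```python
from typing import Callable, List


def retrieve_then_rerank(
    query: str,
    docs: List[str],
    embed_score: Callable[[str, str], float],
    rerank_score: Callable[[str, str], float],
    top_k: int = 2,
    top_n: int = 1,
) -> List[str]:
    """Two-stage retrieval: cheap scoring over all docs, costly scoring over a shortlist."""
    # Stage 1: score every document with the cheap scorer and keep the top_k.
    shortlist = sorted(docs, key=lambda d: embed_score(query, d), reverse=True)[:top_k]
    # Stage 2: re-score only the shortlist with the costlier scorer and keep the top_n.
    return sorted(shortlist, key=lambda d: rerank_score(query, d), reverse=True)[:top_n]


def word_overlap(query: str, doc: str) -> float:
    # Toy stand-in for embedding similarity: number of shared words.
    return float(len(set(query.lower().split()) & set(doc.lower().split())))


def exact_phrase(query: str, doc: str) -> float:
    # Toy stand-in for a reranker: 1.0 if the query appears verbatim in the doc.
    return 1.0 if query.lower() in doc.lower() else 0.0


docs = [
    "the reranker and the bge embedding are different models",
    "bge reranker base is a cross-encoder",
    "unrelated text",
]
best = retrieve_then_rerank("bge reranker", docs, word_overlap, exact_phrase)
print(best)
```

The first two documents tie in stage 1, and only the stage 2 scorer separates them. That is the usual reason for adding a reranker on top of embedding retrieval.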

As for the rest of the questions, I would need to look at the notebook you're referring to; I'd be happy to clear up a few doubts. I think a lot of startups in India, especially in Bangalore, are eyeing RAG pipelines. There's a great community around these folks.
@Arshdeep Kaur

  1. gpt-3.5-turbo by default
  2. GPT-4 works well as an LLM judge for evaluations, so that's why GPT-4 was used.
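On the relevancy question: the query, contexts and response are only the inputs, and something still has to read them and decide whether they agree. That reader is the judge LLM. Below is a minimal sketch of the idea, not LlamaIndex's actual implementation. The prompt wording and the names `build_judge_prompt`, `evaluate_relevancy` and `fake_judge` are assumptions made up for illustration, and `fake_judge` stands in for a real call to a model such as GPT-4.

```python
from typing import Callable, List

# Hypothetical prompt template; a real evaluator would use its own wording.
JUDGE_PROMPT = (
    "Query: {query}\n"
    "Retrieved context:\n{context}\n"
    "Response: {response}\n"
    "Is the response supported by the context and relevant to the query? "
    "Answer YES or NO."
)


def build_judge_prompt(query: str, contexts: List[str], response: str) -> str:
    # Pack the three inputs into one prompt for the judge model.
    return JUDGE_PROMPT.format(
        query=query, context="\n".join(contexts), response=response
    )


def evaluate_relevancy(
    query: str,
    contexts: List[str],
    response: str,
    judge: Callable[[str], str],
) -> bool:
    # The judge (an LLM in practice) reads the prompt and returns a verdict string.
    verdict = judge(build_judge_prompt(query, contexts, response))
    return verdict.strip().upper().startswith("YES")


def fake_judge(prompt: str) -> str:
    # Stand-in for a real LLM call; it always approves, so the sketch runs offline.
    return "yes"


passing = evaluate_relevancy(
    "Which embedding model is used?",
    ["The notebook sets embed_model to bge-small-en-v1.5."],
    "It uses bge-small-en-v1.5.",
    fake_judge,
)
print(passing)
```

Swapping `fake_judge` for a function that calls a stronger model gives the setup described above: gpt-3.5 writes the response and GPT-4 grades it.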
Thanks @beaverTango for clearing the doubts.
Thanks @beaverTango and @ravitheja
Here is the notebook @beaverTango