Hi guys, based on your experience, which reranking model is better: ColBERT or Cohere?
I think Cohere is better, but neither was perfect. I started doing the below; I have no idea if this is suggested or recommended, but I seem to return better results. It's tough to say, since LLMs aren't always consistent when you rerun the same query over and over.

Plain Text
# RagSearch is my own wrapper around the vector store (not shown here);
# imports below assume the llama-index >= 0.10 package layout.
from llama_index.core import Settings, VectorStoreIndex
from llama_index.postprocessor.cohere_rerank import CohereRerank
from llama_index.postprocessor.flag_embedding_reranker import FlagEmbeddingReranker


def main():
    rag = RagSearch()

    # Two rerankers: a local BGE cross-encoder and Cohere's hosted reranker.
    rerank1 = FlagEmbeddingReranker(top_n=4, model="BAAI/bge-reranker-large")
    rerank = CohereRerank(top_n=4, api_key="<snip>")

    index = VectorStoreIndex.from_vector_store(vector_store=rag.vector_store,
                                               embed_model=Settings.embed_model)

    # Retrieve 12 candidates, then chain both rerankers as node postprocessors.
    query_engine = index.as_query_engine(llm=Settings.llm,
                                         similarity_top_k=12,
                                         node_postprocessors=[rerank1, rerank])


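For completeness, running a question through the chained engine is just the standard query call; a tiny sketch (the question text is only a placeholder):

Plain Text
    # inside main(), after building query_engine above
    response = query_engine.query("<your question>")
    print(response)
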
I seem to get better results when using both. Not sure if that's by design or not. I have no metrics, so maybe someone who knows more can weigh in. I don't notice a performance hit either.
You could try running some metrics. I run Ollama locally; with its OpenAI-compatible API support you save a lot of money running observability metrics. I just got it working yesterday, so I haven't had time to measure the setup above.

Plain Text
# Imports below assume the arize-phoenix package layout for phoenix.evals.
import phoenix as px
from phoenix.evals import (
    HallucinationEvaluator,
    OpenAIModel,
    QAEvaluator,
    RelevanceEvaluator,
    run_evals,
)
from phoenix.session.evaluation import get_qa_with_reference, get_retrieved_documents
from phoenix.trace import DocumentEvaluations, SpanEvaluations

# Pull the traced queries and retrieved documents out of the running Phoenix session.
queries_df = get_qa_with_reference(px.Client())
retrieved_documents_df = get_retrieved_documents(px.Client())

# Point the "OpenAI" eval model at a local OpenAI-compatible endpoint instead of OpenAI.
eval_model = OpenAIModel(
    api_key="ollama",
    base_url="http://192.168.0.109:1234/v1/",
    model="<model>",
)

hallucination_evaluator = HallucinationEvaluator(eval_model)
qa_correctness_evaluator = QAEvaluator(eval_model)
relevance_evaluator = RelevanceEvaluator(eval_model)

# Hallucination and QA correctness run over the query/answer pairs...
hallucination_eval_df, qa_correctness_eval_df = run_evals(
    dataframe=queries_df,
    evaluators=[hallucination_evaluator, qa_correctness_evaluator],
    provide_explanation=True,
)

# ...while relevance runs over the retrieved documents.
relevance_eval_df = run_evals(
    dataframe=retrieved_documents_df,
    evaluators=[relevance_evaluator],
    provide_explanation=True,
)[0]

# Log everything back to Phoenix so the scores show up in the UI.
px.Client().log_evaluations(
    SpanEvaluations(eval_name="Hallucination", dataframe=hallucination_eval_df),
    SpanEvaluations(eval_name="QA Correctness", dataframe=qa_correctness_eval_df),
    DocumentEvaluations(eval_name="Relevance", dataframe=relevance_eval_df),
)
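One thing the snippet above assumes: Phoenix has to be running and LlamaIndex has to be traced into it, otherwise those dataframes come back empty. A minimal sketch of that setup (this uses the arize_phoenix global handler; adjust for however you wired up tracing):

Plain Text
import phoenix as px
import llama_index.core

px.launch_app()  # start the local Phoenix UI / trace collector
llama_index.core.set_global_handler("arize_phoenix")  # route LlamaIndex traces to it
# ...run the query engines, then the eval snippet above has data to work with
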
But if you can't run Ollama, you could use OpenAI; it's just expensive. While testing, it cost me something like $27 in under an hour.
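If you do go the hosted OpenAI route, the only change is the eval model; a quick sketch (the model name here is just an example, not a recommendation):

Plain Text
import os

from phoenix.evals import OpenAIModel

eval_model = OpenAIModel(
    model="gpt-4o-mini",  # example model name, swap for whatever you want to pay for
    api_key=os.environ["OPENAI_API_KEY"],
)
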
You can see in the attached screenshot the results for both reranker requests I shared above.
[Attachment: rerank.jpg]
Plain Text
    # Still inside main(): one engine per reranker setup, so results can be compared side by side.
    query_engine0 = index.as_query_engine(llm=Settings.llm,
                                          similarity_top_k=12,
                                          node_postprocessors=[rerank1])

    query_engine1 = index.as_query_engine(llm=Settings.llm,
                                          similarity_top_k=12,
                                          node_postprocessors=[rerank])

    query_engine2 = index.as_query_engine(llm=Settings.llm,
                                          similarity_top_k=12,
                                          node_postprocessors=[rerank1, rerank])

    # Actual test queries elided.
    user_queries = ["""""",
                    """""",
                    """""",
    ]

    query_engines = [query_engine0, query_engine1, query_engine2]

    responses = rag.query_index(query_engines=query_engines, queries=user_queries)

    for response in responses:
        print(response, "\n\n")

^ You could do something like this (easy) to compare the results from a few different use cases/rerankers. (query_index is part of my RagSearch wrapper, not shown above; a rough sketch of the idea is below.)
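This is a hypothetical version of that helper, just to show the shape of the comparison loop, not the exact code:

Plain Text
# Hypothetical stand-in for RagSearch.query_index (the real helper isn't shown above):
# run every query against every engine and collect labelled response strings.
def query_index(self, query_engines, queries):
    responses = []
    for i, engine in enumerate(query_engines):
        for query in queries:
            response = engine.query(query)
            responses.append(f"[engine {i}] {query}\n{response}")
    return responses
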
In this simple test, Cohere seems to be about 2-4x faster at reranking, FWIW.
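If you want to check the latency gap yourself rather than eyeballing it, you can time the two postprocessors directly; a rough sketch (the candidate nodes and query are placeholders, in practice use nodes retrieved from your own index):

Plain Text
import time

from llama_index.core.schema import NodeWithScore, TextNode

# Placeholder candidates; swap in real retrieved nodes for a meaningful comparison.
nodes = [NodeWithScore(node=TextNode(text=f"chunk {i}"), score=0.0) for i in range(12)]
query = "<your question>"

for name, reranker in [("bge-reranker-large", rerank1), ("cohere", rerank)]:
    start = time.perf_counter()
    reranker.postprocess_nodes(nodes, query_str=query)
    print(f"{name}: {time.perf_counter() - start:.2f}s")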