Hello, I've been testing with hybrid

Hello! I've been testing hybrid search with both Weaviate and Pinecone and got very weird results.
I am searching for restaurants based on their descriptions; here I am printing the name of each restaurant and its score.
It seems that BM25 is simply not working when I set alpha > 0.
With Weaviate (exact same setup) I got score = 1 for every alpha > 0; only the regular retriever got the right score.
Obs: this happens no matter what query I use.
Obs 2: I am using the free version of both Pinecone and Weaviate.

I am happy to share more code if necessary.

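# idx is an existing index backed by Weaviate or Pinecone
ret = idx.as_retriever(similarity_top_k=5)  # regular (default) retriever
# hybrid mode at three alpha values: 0.0 (named bm25), 0.75, and 1.0 (named hsnw)
bm25 = idx.as_retriever(similarity_top_k=5, vector_store_query_mode="hybrid", alpha=0.0)
hret = idx.as_retriever(similarity_top_k=5, vector_store_query_mode="hybrid", alpha=0.75)
hsnw = idx.as_retriever(similarity_top_k=5, vector_store_query_mode="hybrid", alpha=1.0)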
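
For readers unfamiliar with what alpha is meant to control, here is a minimal sketch of one common way to blend keyword and vector scores. It is an illustration only: the helper names `min_max` and `blend` and all the numbers are made up, and it is not claimed to be what Weaviate, Pinecone, or LlamaIndex actually compute internally. It assumes the usual convention that alpha = 0.0 means keyword (BM25) only and alpha = 1.0 means vector only, which matches the variable names above.

```python
def min_max(scores):
    """Rescale scores to [0, 1]; a constant list maps to all 1.0."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [1.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def blend(bm25_scores, vector_scores, alpha):
    """Weighted sum of normalized scores: alpha weights the vector side."""
    kw = min_max(bm25_scores)
    vec = min_max(vector_scores)
    return [alpha * v + (1.0 - alpha) * k for k, v in zip(kw, vec)]


# Made-up scores for three hypothetical restaurants (same order in both lists).
bm25_scores = [12.0, 3.0, 0.0]
vector_scores = [0.25, 0.75, 0.5]

for alpha in (0.0, 0.75, 1.0):
    print(alpha, blend(bm25_scores, vector_scores, alpha))
```

In a scheme like this the scores should change as alpha changes, and the best hit only reaches exactly 1.0 when it tops every ranking that carries weight. Scores that stay fixed at 1 for every result would therefore not be explained by this kind of blending alone.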

Attachment

5 comments

Logan M

I think this is an issue with Weaviate or Pinecone itself 😅 I'm not sure what the issue is here.

The source code for the Weaviate vector store is here, if you want to confirm the usage:
https://github.com/run-llama/llama_index/blob/31d132c56b1836603d48e02786cd29a74a28f527/llama_index/vector_stores/weaviate.py#L215

lucastonon

Sounds intriguing. I was wondering if this is related to using the free version of both 🙂 Maybe they just don't allow it.

lucastonon

I will see if I can find some time to deep dive into the code, thanks man!

lucastonon

Also, I was wondering if this is just a bug in the way we bring the scores into the retriever, because the retrievals themselves are pretty reasonable.

Logan M

Yeah, it could be a bug with how the vector DB generates scores for these types of retrievals, or with how LlamaIndex parses the result. Not sure.
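
One way to narrow this down is to check, per alpha, whether the returned scores carry any information at all. The sketch below is a hypothetical helper (`degenerate_alphas` is not part of any library) and the restaurant names and scores are invented; it assumes the results have already been collected as (name, score) pairs like the printout described in the original post. If the ranking differs between alphas while the scores are all identical, that would point toward score generation or parsing rather than the retrieval itself.

```python
def degenerate_alphas(results_by_alpha, tol=1e-9):
    """Return the alphas whose scores are all (nearly) identical."""
    flagged = []
    for alpha, hits in sorted(results_by_alpha.items()):
        scores = [score for _name, score in hits]
        # A single hit cannot be judged; with two or more, a zero spread
        # means the scores say nothing about the ranking.
        if len(scores) > 1 and max(scores) - min(scores) <= tol:
            flagged.append(alpha)
    return flagged


# Invented (name, score) pairs keyed by alpha, for illustration only.
results_by_alpha = {
    0.0: [("Cafe A", 4.2), ("Bistro B", 3.1), ("Diner C", 1.7)],
    0.75: [("Bistro B", 1.0), ("Cafe A", 1.0), ("Diner C", 1.0)],
    1.0: [("Bistro B", 1.0), ("Diner C", 1.0), ("Cafe A", 1.0)],
}

print(degenerate_alphas(results_by_alpha))
```

The next step would be to compare these values with the raw response from the vector store for the same query, to see on which side the constant score first appears.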
