Hi guys!
Maybe a bit of a stupid question, but do I need to use the same embedding model for indexing and retrieval? Or would it be possible to index data using some large model like Alibaba-NLP/gte-Qwen2-7B-instruct and then use BAAI/bge-large-en-v1.5 for creating a query_engine from the vector index?
6 comments
Also, is it possible to pass a quantization_config to embed_model? I've seen that under the hood it triggers SentenceTransformer, so theoretically it should be possible to pass a quantization config via model_kwargs
Yes, you should use the same model for indexing and retrieval.

qwen isn't an embedding model, so I'm not sure what you mean here exactly 🤔 There is a specific embedding model that's used to embed text, and a specific LLM used to generate text/responses

So qwen can be the LLM, and BGE can be your embeddings
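As a sketch of that split, assuming the current llama-index `Settings` API and its HuggingFace integrations (both model names here are just illustrative defaults):

```python
# Hedged sketch: Qwen as the LLM, BGE as the embedding model.
# Assumes llama-index-llms-huggingface and llama-index-embeddings-huggingface
# are installed; model names are illustrative, not prescriptive.
from llama_index.core import Settings
from llama_index.llms.huggingface import HuggingFaceLLM
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# The LLM generates text/responses...
Settings.llm = HuggingFaceLLM(model_name="Qwen/Qwen2-7B-Instruct")
# ...while the embedding model embeds documents and queries for retrieval.
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-large-en-v1.5")
```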
@bszaniecki you don't strictly need to. As long as the dimensions match, it'll give you results. It's just (presumably) poor practice due to differences in how the embed model was trained. I've tested this and it works, but it was an extremely limited test just to see if it would even work at all.
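To see why matching dimensions isn't enough, here's a toy simulation (random projections standing in for two hypothetical embedding models, nothing here is a real model): vectors from the same "model" stay comparable, while mixing two "models" scrambles the similarity scores retrieval depends on.

```python
import numpy as np

rng = np.random.default_rng(0)
dim_in, dim_out = 32, 8

# Two stand-in "embedding models": different random projections
# that happen to share the same output dimension.
model_a = rng.normal(size=(dim_in, dim_out))
model_b = rng.normal(size=(dim_in, dim_out))

def embed(model, x):
    """Project x and L2-normalize, so dot product = cosine similarity."""
    v = x @ model
    return v / np.linalg.norm(v)

doc = rng.normal(size=dim_in)               # stand-in for an indexed document
query = doc + 0.05 * rng.normal(size=dim_in)  # a near-identical query

# Same model for indexing and querying: similarity stays high.
same_model = float(embed(model_a, doc) @ embed(model_a, query))
# Different models: same dimension, but the spaces are unrelated.
mixed = float(embed(model_a, doc) @ embed(model_b, query))

print(f"same model: {same_model:.3f}, mixed models: {mixed:.3f}")
```

The lookup still "works" (you get numbers back), but the mixed-model score no longer reflects semantic closeness, which is the failure mode described above.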
@Logan M I'm talking about the gte version of Qwen2: https://huggingface.co/Alibaba-NLP/gte-Qwen2-7B-instruct. It currently leads the MTEB leaderboard (among models with an Apache license). It's extremely useful for me thanks to its multilingual capabilities. However, I'm currently running 2x L40S, and after loading a 70B model with transformers I'm unable to fit the Qwen embedding model, as it targets one card only. That's why I was interested in using a quantized embed model. I'll do some testing and let you guys know
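For reference, the quantized load being tested might look something like this. This is an unverified sketch: it assumes `HuggingFaceEmbedding` forwards `model_kwargs` to the underlying SentenceTransformer, which in turn passes them to transformers' `from_pretrained`, so a `quantization_config` should flow through — but that pass-through is the assumption being tested, not confirmed behavior.

```python
# Hedged sketch, not confirmed API behavior: load a large embedding model
# in 4-bit via bitsandbytes by threading quantization_config through
# model_kwargs. Requires bitsandbytes and a CUDA GPU.
import torch
from transformers import BitsAndBytesConfig
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

embed_model = HuggingFaceEmbedding(
    model_name="Alibaba-NLP/gte-Qwen2-7B-instruct",
    model_kwargs={"quantization_config": quant_config},
)
```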
ah I see. I wouldn't take MTEB too seriously once you get into the top 10. The score differences are minor, and the cost of loading a huge 7B model for embeddings isn't worth it imo, when 400M models or smaller get similar results
Just my two cents though, but sounds like it works well for multi-lingual for you