Embedding

At a glance

The community member is having issues with the sentence-transformers/all-MiniLM-L6-v2 embedding model when using it with llama-index. They found that the model does not provide the expected answer, but using BAAI/bge-base-en-v1 instead does. The community member wants to know if the sentence-transformers/all-MiniLM-L6-v2 model is incompatible with llama-index or if they need to adjust other parameters to make it work.

In the comments, other community members suggest trying to decrease the chunk size while indexing or using a different model, such as a BGE model. One community member mentions that the sentence-transformers/all-MiniLM-L6-v2 model is not great and recommends using BAAI/bge-m3 instead. Another community member suggests bge-small-en-v1.5 as a good choice for a tiny model.

There is no explicitly marked answer in the post or comments.

With the previous code, I get the following answer:

Plain Text
The author didn't mention what they did growing up. The context only talks about the author's experiences as an adult, such as painting, working on web apps, and starting companies. There is no information about their childhood or growing up years.


By the way, if I replace "sentence-transformers/all-MiniLM-L6-v2" with "BAAI/bge-base-en-v1", I get the expected answer:

Plain Text
The author wrote short stories and tried writing programs on the IBM 1401 computer in 9th grade.


I want to know if sentence-transformers/all-MiniLM-L6-v2 is not compatible with llama-index or if we need to adjust other parameters to make it work. Thank you in advance for your suggestions.
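For context, the code referenced above isn't included in this excerpt, so the following is only a minimal sketch of the kind of setup being described. It assumes the current Settings-based llama-index API, the HuggingFaceEmbedding integration, and the Paul Graham essay example the quoted answers appear to come from; the "data" directory and the query string are placeholders, and a default LLM is assumed to be available for the query step.

Python
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# Embedding model under discussion; swapping the model_name string
# (e.g. to "BAAI/bge-base-en-v1") is the only change needed to compare models.
Settings.embed_model = HuggingFaceEmbedding(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)

# "data" is a placeholder directory containing the document being queried.
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)

# The query step uses whatever LLM is configured (OpenAI by default).
query_engine = index.as_query_engine()
print(query_engine.query("What did the author do growing up?"))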
4 comments
That embedding model is not great 😅 you can try decreasing the chunk size while indexing (Maybe 512?) Or use another model? I usually use some BGE model
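The thread doesn't show the indexing code, but "decreasing the chunk size while indexing" with the Settings-based API would be a sketch along these lines; the exact numbers (512, and the overlap) are illustrative rather than values confirmed in the thread.

Python
from llama_index.core import Settings
from llama_index.core.node_parser import SentenceSplitter

# Smaller chunks so each indexed node stays within the embedding model's input limit.
Settings.chunk_size = 512
Settings.chunk_overlap = 20

# Equivalent explicit form: supply a splitter as the ingestion transformation.
Settings.transformations = [SentenceSplitter(chunk_size=512, chunk_overlap=20)]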
Yeah, that seems like the only possible conclusion. I saw in the documentation that text longer than 256 word pieces is truncated. I tried reducing the chunk size to 256, but it still doesn't work. Too bad, I wanted to use a lightweight multilingual model, but I'll settle for BAAI/bge-m3
thanks for your time anyway @Logan M
bge-small-en-v1.5 is probably a good choice for a tiny model imo
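Either recommendation from the comments is a one-line swap in the setup sketched earlier; the model identifiers below are the ones mentioned above.

Python
from llama_index.core import Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# "BAAI/bge-small-en-v1.5" as the tiny English option, or "BAAI/bge-m3"
# for a heavier multilingual model.
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")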