Updated 10 months ago

Hi there. When we talk about multilingual models for embeddings, what do we mean exactly?
  1. That we can efficiently embed a document in any supported language and query it in any other supported language (i.e. the embeddings encode the same representation regardless of the original language)?
  2. That we can efficiently embed a document in any supported language, but the query must be in the same language as the document?
I'm asking because I noticed that a query in French did not retrieve the relevant English document whereas the English-translated query retrieved it perfectly. Doing a test by switching roles between FR and EN showed the same behaviour.
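For anyone wanting to reproduce this kind of check, here is a minimal sketch of the comparison described above: score the same document against a query in the document's language and against its translation, then look at the gap. The names `cross_lingual_gap` and `toy_embed` are hypothetical and not part of any library. `toy_embed` is a deliberately monolingual stand-in (word counts over a tiny vocabulary) so the snippet runs on its own; in practice you would pass in whatever embedding function your model exposes.

```python
import math
from typing import Callable, Dict, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cross_lingual_gap(
    embed: Callable[[str], Vector],
    document: str,
    same_lang_query: str,
    other_lang_query: str,
) -> Dict[str, float]:
    """Compare how well a query retrieves `document` in each language.

    `embed` is an assumption: any function mapping text to a vector.
    A gap close to 0 suggests interpretation 1 holds for this pair;
    a large positive gap suggests behaviour closer to interpretation 2.
    """
    doc_vec = embed(document)
    same = cosine(embed(same_lang_query), doc_vec)
    other = cosine(embed(other_lang_query), doc_vec)
    return {"same_language": same, "cross_language": other, "gap": same - other}


# Hypothetical stand-in embedding: counts of words from a fixed vocabulary.
# It has no notion of translation, so it mimics a model that is NOT cross-lingual.
TOY_VOCAB = ["cat", "sleeps", "chat", "dort"]


def toy_embed(text: str) -> Vector:
    words = text.lower().split()
    return [float(words.count(term)) for term in TOY_VOCAB]


result = cross_lingual_gap(
    toy_embed,
    document="the cat sleeps",
    same_lang_query="cat sleeps",
    other_lang_query="le chat dort",
)
print(result)
```

Running this over a handful of document/query pairs in both directions (FR to EN and EN to FR) should show whether the gap is systematic or limited to particular queries.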
4 comments
Both.

#1 works well for translation services.

#2 works well for language detection/classification services.

What you are experiencing, however, sounds more like a loss-of-precision problem with English-French translation itself to me. In other words, it's more of a linguistic problem.
@Vicent W. Thanks for your answer. What do you mean by an English-French translation problem? That the translated query would be wrong, for example? I checked a couple of examples myself and everything seems fine in that regard.
Sure. Here's what Google gave me:
Here are some French words that are untranslatable in English:
  • Dépaysement: A common French word that has no direct English translation. It's often translated as a change of scenery, but it can also refer to the feeling of being out of your comfort zone or abroad in an unfamiliar place.
  • Retrouvailles: A sweet French word that describes the happiness of reuniting with someone you haven't seen in a long time.
  • Voilà: A common French word that can be used in many situations. It translates roughly to "there you go" or "there it is" and is used to draw attention to something that has recently happened or is nearby.
  • Flâner: A French concept that can be translated as "to stroll" or "to lounge". A flâneur is someone who wanders through a city without a destination, but with the purpose of observing the world in a philosophical way.
  • Cartonner: A slang term used to describe successful films. It comes from the action of covering an object in cardboard.
  • La Douleur Exquise: Translates to "the exquisite agony" and describes the extreme pain you feel when you have feelings for someone who isn't giving it back to you.
  • Yaourter: Literally translates to "to yogurt" and is used to describe singing in a foreign language and getting the words wrong or filling in the words with sounds like tra-la-la.
  • Bon Vivant: Translates to "good liver" and refers to a person, not the organ.
  • Méconnaissance: Corresponds roughly to the English words "misunderstanding" and "misrecognition".

I'm not saying your query contained any words like these, but I want to call out that some concepts translate poorly in one direction while translating fine in the other. I suspect the asymmetry you experienced is a manifestation of this phenomenon.
Ok, that's very clear. Thanks a lot for your detailed answer. I will look in this direction.