At a glance

The community member is working with a fine-tuned embedding model for a Q/A task in the LlamaIndex framework and has observed different results between Sentence Transformers and Hugging Face Embeddings for text similarity. They are seeking insights on which embedding approach might be better suited for semantic similarity in Q/A contexts. Additionally, they are looking for information on how LlamaIndex calculates scores in query results via VectorStoreIndex and are interested in tips on optimizing embeddings and understanding the scoring mechanisms.

In the comments, another community member suggests checking the Embedding leaderboard and provides links to relevant documentation on the LlamaIndex base embed class and embeddings. The original community member follows up with a question about the scoring mechanism, specifically whether the scores in the retrieval results are based on cosine similarity, and they also reiterate their uncertainty about which embedding approach (Sentence Transformer or Hugging Face Embedding) would be more suitable for their use case.

Other community members provide additional insights, mentioning that embedding results can vary based on hardware specifications and that there might be pooling differences between Sentence Transformers and Hugging Face Embeddings, with Sentence Transformers being more automatic in terms of CLS vs. mean pooling.

Hi there, I'm working with a fine-tuned embedding model for a Q/A task in the LlamaIndex framework. I've observed different results between Sentence Transformers and Hugging Face Embeddings for text similarity.

  1. Does anyone have experience with which might be better for semantic similarity in Q/A contexts?
  2. Also, where can I find information on how LlamaIndex calculates scores in query results via VectorStoreIndex?
Looking for insights or tips on optimizing embeddings and understanding scoring mechanisms in this setup. Thanks!
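For context, here is roughly how I'm comparing the two wrappers on the same checkpoint (the model path is a placeholder for our fine-tuned model, and the HuggingFaceEmbedding import path assumes a recent llama-index; older versions expose it elsewhere):

Python
import numpy as np
from sentence_transformers import SentenceTransformer
from llama_index.embeddings.huggingface import HuggingFaceEmbedding  # import path may differ by version

MODEL_PATH = "path/to/our-fine-tuned-model"  # placeholder for the fine-tuned checkpoint
text_a = "How do I reset my password?"
text_b = "Steps to recover a forgotten password."

def cosine(a, b):
    a, b = np.asarray(a), np.asarray(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Sentence Transformers: pooling comes from the model's saved pooling config
st_model = SentenceTransformer(MODEL_PATH)
st_score = cosine(st_model.encode(text_a), st_model.encode(text_b))

# LlamaIndex HuggingFaceEmbedding wrapper around the same checkpoint
hf_model = HuggingFaceEmbedding(model_name=MODEL_PATH)
hf_score = cosine(hf_model.get_text_embedding(text_a), hf_model.get_text_embedding(text_b))

print(f"sentence-transformers: {st_score:.4f}  llama-index HF: {hf_score:.4f}")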
5 comments
  1. You can check the Embedding leaderboard: https://huggingface.co/spaces/mteb/leaderboard and look for different parameters on which the embedding models are ranked.
  2. For your second query, check out the base embed class: https://github.com/run-llama/llama_index/blob/main/llama_index/core/embeddings/base.py
     You can also check the docs here: https://docs.llamaindex.ai/en/stable/module_guides/models/embeddings.html#concept
Thank you for the prompt response.
I have a follow-up question regarding the scoring mechanism used in the retrieval process. Given the code snippet:
Plain Text
# assuming llama-index >= 0.10; on older versions import from `llama_index` directly
from llama_index.core import StorageContext, load_index_from_storage

# Reload the persisted index and fetch the top-5 most similar nodes for the question
storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
index = load_index_from_storage(storage_context)
retriever = index.as_retriever(similarity_top_k=5)
result = retriever.retrieve(question)

Could you clarify if the scores in the result are based on cosine similarity? I'm observing some differences between the similarity scores calculated directly using the embedding model for the question and the result node text, and the scores returned by the retriever. I want to make sure my implementation aligns correctly with the expected behavior.
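For reference, here's roughly how I'm checking the scores by hand against the retriever output above (the model path is a placeholder for our fine-tuned checkpoint, and the import path may differ across llama-index versions):

Python
import numpy as np
from llama_index.embeddings.huggingface import HuggingFaceEmbedding  # import path may differ by version

def cosine(a, b):
    a, b = np.asarray(a), np.asarray(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# same fine-tuned model the index was built with (placeholder path)
embed_model = HuggingFaceEmbedding(model_name="path/to/our-fine-tuned-model")

# compare the retriever's score with a manually computed cosine similarity
q_emb = embed_model.get_query_embedding(question)
for node_with_score in result:
    n_emb = embed_model.get_text_embedding(node_with_score.node.get_content())
    print(f"retriever: {node_with_score.score:.4f}  manual cosine: {cosine(q_emb, n_emb):.4f}")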

Regarding my first question, it seems I might not have been clear. We've fine-tuned our embedding model, but I'm uncertain which embedding approach, SentenceTransformer or HuggingFaceEmbedding, would be more suited to our use case.
Embedding results can vary based on hardware specs, and they can vary between iterations. Once, when I was calculating the similarity for some text, it came out around 0.70 in one iteration and 0.71 in the next.

But there won't be a major change in value, like dropping from 0.70 to 0.23, unless your embedding model goes berserk 😅.
There might also be a pooling difference between sentence-transformers and our Hugging Face embeddings.

Sentence Transformers is a bit more automatic about choosing CLS vs. mean pooling. Ours defaults to CLS, but I need to make that automatic.
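To make the pooling point concrete, here's a rough illustration of CLS vs. mean pooling over raw transformers outputs (the model path is a placeholder, and this is a simplified sketch, not exactly what either library does internally):

Python
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_PATH = "path/to/our-fine-tuned-model"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModel.from_pretrained(MODEL_PATH)

inputs = tokenizer("How do I reset my password?", return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state        # (1, seq_len, hidden_dim)

# CLS pooling: take the first token's hidden state
cls_emb = hidden[:, 0]

# Mean pooling: average the token hidden states, ignoring padding
mask = inputs["attention_mask"].unsqueeze(-1)          # (1, seq_len, 1)
mean_emb = (hidden * mask).sum(dim=1) / mask.sum(dim=1)

# The two vectors generally differ, so cosine scores computed from them differ too
print(torch.nn.functional.cosine_similarity(cls_emb, mean_emb))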
Thanks for your help!