from llama_index.embeddings import HuggingFaceEmbedding
# Currently I'm loading embeddings like this:
embed_model = HuggingFaceEmbedding(model_name='jinaai/jina-embeddings-v2-base-en', pooling='mean')
But it is pretrained.
I fine-tuned a BGE model for my use case, and I want to load that model instead of this pretrained one.
Is the fine-tuned model available on HF?
No, it's not available on HF. I trained a BGE model for my use case and have it stored locally.
Okay, then you can use the following to load your own embed model.
Suppose I defined a class like this:
InstructorEmbeddings(BaseEmbedding)
How do I load this class to use it in LlamaIndex?
Also, can I load a model like this for my use case?
Got it. Thank you so much for the resolution 🙏
Hi.
I was able to integrate my custom embedding model. But I observed that when I run the same query multiple times, the generated embeddings are identical, yet I get different similarity scores, which also changes the node order. It is the same index in both cases.
@WhiteFang_Jr @Logan M can you guys please help me with this?
For the same query with no change?
Are you using a chat engine? If you're using condense mode, it rewrites your query, so that could be one reason.
No, I'm using query engine.
Yes, same query with no changes. I checked the stored embeddings; they are the same, but the similarity score is not.
for reference:
query_engine = index.as_query_engine(
    response_synthesizer=response_synthesizer,
    similarity_top_k=kVal,
)
responses = query_engine.query(qry + ' in the document. Answer in a sentence')
I just checked; there is a slight difference in the similarity score. One time it retrieved the node with 0.70, and the next time with 0.71.
I guess your nodes are being retrieved with almost identical scores, so even a small fluctuation is enough to change the order.
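To make that concrete, here is a pure-Python illustration (made-up numbers, no LlamaIndex involved) of how a wobble in the fourth significant digit of the query embedding can swap the order of two nodes whose scores are almost tied:

```python
import math

def cosine(a, b):
    """Plain cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Two stored node embeddings that score nearly identically against the query
nodes = {"node_A": [0.70, 0.71], "node_B": [0.71, 0.70]}

# The "same" query embedded on two different runs, differing only by ~0.001
# (the kind of numerical wobble hardware or libraries can introduce)
query_run_1 = [1.000, 1.001]
query_run_2 = [1.001, 1.000]

scores_1 = {name: cosine(query_run_1, vec) for name, vec in nodes.items()}
scores_2 = {name: cosine(query_run_2, vec) for name, vec in nodes.items()}

top_1 = max(scores_1, key=scores_1.get)
top_2 = max(scores_2, key=scores_2.get)
print(top_1, top_2)  # the top-ranked node flips between the two runs
```

The score gap between the two nodes is far smaller than the perturbation needed to flip them, which is exactly the 0.70 vs 0.71 situation described above.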
Yes. That is what I observed.
Is it changing the response?
Yes, it is changing the response.
Why is there a difference in the similarity score when both the query and the index are the same?
I don't have much insight into the embedding model internals, so we'll have to wait for Logan's input on this.
Meanwhile, you can interact with the model directly and check whether it generates a different score for the same text:
# Run this a few times to check whether the score varies for the same inputs
embedding_node = embed_model.get_text_embedding(node_text)
# Note: the query engine embeds queries via get_query_embedding, not get_text_embedding
embedding_query = embed_model.get_query_embedding(query)

# Compare the score across runs
print(embed_model.similarity(embedding_node, embedding_query))
Calculating embeddings is not always 100% deterministic. Depending on your hardware, very minor variations can change the similarity score (i.e. 0.70 -> 0.71), especially for local models.
As @WhiteFang_Jr mentioned though, you can test yourself with the above.
Not really much else I can add.
Hi @Logan M. But the embeddings created are the same every time; just the similarity score is not consistent.
Moreover, the LlamaIndex pipeline is giving inconsistent results, but when I calculate it manually as @WhiteFang_Jr suggested, I get consistent results every time.
Hi @WhiteFang_Jr
With LlamaIndex I'm sometimes getting different embeddings for the same query; Logan seems to be saying the same thing.
When I load the model and create embeddings without the LlamaIndex pipeline, I get the same embeddings every time.
So I'm wondering: can I somehow create the embeddings locally and pass them to the LlamaIndex pipeline for retrieval of similar nodes and the final prediction? By locally I mean creating the embeddings with an external class, without using the LlamaIndex pipeline.
Can we do that?
I think you are already using the custom embedding wrapper, right?
Then the same thing is still being done: you pass the text to your model and it returns the embedding, right?
Yes, I'm using a custom embedding wrapper in LlamaIndex, but the embeddings are not the same every time.
But if I just create embeddings outside the LlamaIndex pipeline, I get the same embeddings every time.
I understand the similarity part.
But I'm getting two different embeddings for the same query without any change.
I'm attaching two screenshots of the embeddings for reference.
Any idea why this might be happening?
When I use my custom model to generate the embeddings outside the LlamaIndex pipeline, I get the same embeddings every time. I only get different embeddings when I generate them through the LlamaIndex pipeline.
Just providing extra information.
Yes, I have my own embed model. I will try to fetch the embeddings with an extra layer.
Thanks for the help 🙏