Hello Guys,
I am working on a RAG project and ran into a problem while trying to use HuggingFace embeddings in llama-index, and I was hoping someone could help me.
I tried several French/multilingual embedding models from HuggingFace and none of them work (e.g.
https://huggingface.co/antoinelouis/biencoder-camembert-base-mmarcoFR). The only family that works for me is the English BAAI bge family.
I used the code below :
from llama_index.finetuning import EmbeddingQAFinetuneDataset
from llama_index.schema import TextNode

# load the QA dataset and turn its corpus into nodes
dataset = EmbeddingQAFinetuneDataset.from_json("path_to_dataset")
corpus = dataset.corpus
nodes = [TextNode(id_=id_, text=text) for id_, text in corpus.items()]

from llama_index import ServiceContext, VectorStoreIndex
from llama_index.embeddings import HuggingFaceEmbedding

# local HuggingFace checkpoint as the embedding model
embed_model = HuggingFaceEmbedding(model_name="local_path_to_model")
service_context = ServiceContext.from_defaults(embed_model=embed_model)
index = VectorStoreIndex(nodes, service_context=service_context, show_progress=True)
The last line crashes and gives me the error :
CUDA error : CUBLAS_STATUS_NOT_INITIALIZED when calling `cublasCreate(handle)`
When I try to run on the CPU instead, I get a different error :
IndexError : index out of range in self
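In case it helps to narrow things down, here is how I would call the embedding model on its own, outside the index build, to check whether the crash comes from the embedding call itself (the model path is just a placeholder for my local checkpoint) :

from llama_index.embeddings import HuggingFaceEmbedding

# call the embedding model directly, without going through VectorStoreIndex
embed_model = HuggingFaceEmbedding(model_name="local_path_to_model")
vector = embed_model.get_text_embedding("Ceci est une phrase de test.")
print(len(vector))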
Here are some specs :
- OS : Linux
- GPU : Nvidia A100 80GB of VRAM
- CUDA version : 11.8 (I am unable to change this because I am not the admin on this machine).
- torch version : 2.0.1 (To my knowledge, it's the latest one that works with this version of CUDA).
- python version : 3.10.6
- llama-index version : 0.8.65
PS : This embedding works just fine for building a RAG in langchain.
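For reference, this is roughly what I do on the langchain side (a minimal sketch, the path is a placeholder for the same local checkpoint); langchain wraps sentence-transformers here, so the pooling is handled for me :

from langchain.embeddings import HuggingFaceEmbeddings

# langchain wrapper around sentence-transformers
embeddings = HuggingFaceEmbeddings(model_name="local_path_to_model")
vector = embeddings.embed_query("Ceci est une phrase de test.")
print(len(vector))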
PS-2 : A coworker told me that the model I tried to use doesn't define a pooling layer. When it is loaded via sentence-transformers (which is what langchain uses), mean pooling is applied over the token embeddings; HuggingFaceEmbedding from llama_index.embeddings might not be applying that pooling.
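If it is relevant, my understanding of what sentence-transformers does in that case is mean pooling of the token embeddings weighted by the attention mask. A minimal sketch with plain transformers, assuming the local path is a standard transformers checkpoint :

from transformers import AutoTokenizer, AutoModel
import torch

tokenizer = AutoTokenizer.from_pretrained("local_path_to_model")
model = AutoModel.from_pretrained("local_path_to_model")

inputs = tokenizer(["Ceci est une phrase de test."], padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# mean pooling over token embeddings, weighted by the attention mask
token_embeddings = outputs.last_hidden_state
mask = inputs["attention_mask"].unsqueeze(-1).float()
sentence_embedding = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1)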
Thank you very much for your help,
Kind regards,
Mahmoud