Hello Guys,
I am working on a RAG project and ran into a problem while trying to use HuggingFace embeddings in llama-index, and I was hoping someone could help me.
I tried several French/multilingual embedding models from HuggingFace and none of them work (e.g.
https://huggingface.co/antoinelouis/biencoder-camembert-base-mmarcoFR). The only family that works for me is the English BAAI bge family.
I used the code below :
from llama_index.finetuning import EmbeddingQAFinetuneDataset
from llama_index.schema import TextNode

# load the QA dataset and turn its corpus into nodes
dataset = EmbeddingQAFinetuneDataset.from_json("path_to_dataset")
corpus = dataset.corpus
nodes = [TextNode(id_=id_, text=text) for id_, text in corpus.items()]

from llama_index import ServiceContext, VectorStoreIndex
from llama_index.embeddings import HuggingFaceEmbedding

# local HuggingFace checkpoint as the embedding model
embed_model = HuggingFaceEmbedding(model_name="local_path_to_model")
service_context = ServiceContext.from_defaults(embed_model=embed_model)
index = VectorStoreIndex(nodes, service_context=service_context, show_progress=True)
The last line crashes and gives me the error :
CUDA error : CUBLAS_STATUS_NOT_INITIALIZED when calling `cublasCreate(handle)`
When I try to run on the CPU instead, I get a different error :
IndexError : index out of range in self
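In case it helps to narrow things down, here is how I would call the embedding model on its own, outside the index build, to check whether the crash comes from the embedding call itself (the model path is just a placeholder for my local checkpoint) :

from llama_index.embeddings import HuggingFaceEmbedding

# call the embedding model directly, without going through VectorStoreIndex
embed_model = HuggingFaceEmbedding(model_name="local_path_to_model")
vector = embed_model.get_text_embedding("Ceci est une phrase de test.")
print(len(vector))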
Here are some specs :
- OS : Linux
- GPU : Nvidia A100 80GB of VRAM
- CUDA version : 11.8 (I am unable to change this because I am not the admin on this machine).
- torch version : 2.0.1 (To my knowledge, it's the latest one that works with this version of CUDA).
- python version : 3.10.6
- llama-index version : 0.8.65
PS : This embedding works just fine for building a RAG in langchain.
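For reference, this is roughly what I do on the langchain side (a minimal sketch, the path is a placeholder for the same local checkpoint); langchain wraps sentence-transformers here, so the pooling is handled for me :

from langchain.embeddings import HuggingFaceEmbeddings

# langchain wrapper around sentence-transformers
embeddings = HuggingFaceEmbeddings(model_name="local_path_to_model")
vector = embeddings.embed_query("Ceci est une phrase de test.")
print(len(vector))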
PS-2 : A coworker told me that the model I tried to use doesn't define a pooling layer. When it is loaded via sentence-transformers (which is what langchain uses), mean pooling is applied over the token embeddings; HuggingFaceEmbedding from llama_index.embeddings might not be applying that pooling.
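If it is relevant, my understanding of what sentence-transformers does in that case is mean pooling of the token embeddings weighted by the attention mask. A minimal sketch with plain transformers, assuming the local path is a standard transformers checkpoint :

from transformers import AutoTokenizer, AutoModel
import torch

tokenizer = AutoTokenizer.from_pretrained("local_path_to_model")
model = AutoModel.from_pretrained("local_path_to_model")

inputs = tokenizer(["Ceci est une phrase de test."], padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# mean pooling over token embeddings, weighted by the attention mask
token_embeddings = outputs.last_hidden_state
mask = inputs["attention_mask"].unsqueeze(-1).float()
sentence_embedding = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1)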
Thank you very much for your help,
Kind regards,
Mahmoud