
Updated 2 months ago

Warning: No Hugging Face Embedding Model Found from Local Storage

At a glance

The community member is trying to load a Hugging Face embedding model from local storage using the HuggingFaceEmbedding wrapper, but is encountering a warning that the model is not found in the specified local directory. They have tried to load the BAAI/bge-small-en-v1.5 model from the local directory with embed_model_local_path = Path("local_storage_for_embedding_model") followed by Settings.embed_model = HuggingFaceEmbedding(model_name=str(embed_model_local_path), local_files_only=True).

In the comments, another community member suggests that the model name can be directly provided to the HuggingFaceEmbedding constructor, and it will first look in the cache folder before downloading the model. The community members discuss whether the local_files_only=True parameter can be used to enforce loading the model from the local files, but it is not clear if this parameter is available in the HuggingFaceEmbedding class.

There is no explicitly marked answer in the comments, but the community members seem to have reached an understanding that providing the model name directly to the HuggingFaceEmbedding constructor is sufficient, since the model is loaded from the cache folder on subsequent runs once it has been downloaded at least once.

Hi all, I am trying to load a Hugging Face embedding model from local storage using the HuggingFaceEmbedding wrapper from here: https://docs.llamaindex.ai/en/stable/examples/embeddings/huggingface/. However, no matter what I have tried, I can't make it load the model properly from local storage, as I get the following warning: WARNING - No sentence-transformers model found with name local_storage_for_embedding_model. Creating a new one with mean pooling.

The code I use is:

from pathlib import Path

from llama_index.core import Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

embed_model_local_path = Path("local_storage_for_embedding_model")

Settings.embed_model = HuggingFaceEmbedding(
    model_name=str(embed_model_local_path),  # pass the local directory path
    local_files_only=True,
)

and I am trying to load BAAI/bge-small-en-v1.5 (I have downloaded all the necessary files from Hugging Face to my local system, into a folder named "local_storage_for_embedding_model" in the root directory of my repository).
Can anyone please help? Thanks in advance.
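
That warning typically comes from the sentence-transformers library when the folder does not contain a full sentence-transformers checkpoint (e.g. modules.json and the pooling config), so it falls back to a plain transformer with mean pooling. A minimal sketch of one way to create a folder that loads cleanly, assuming the sentence-transformers package is installed (the folder name is just the one from the question):

Plain Text
from sentence_transformers import SentenceTransformer

from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# One-time step (online): download the model and save a complete
# sentence-transformers checkpoint into a local folder.
SentenceTransformer("BAAI/bge-small-en-v1.5").save("local_storage_for_embedding_model")

# Later runs (offline): point the wrapper at that folder.
embed_model = HuggingFaceEmbedding(model_name="local_storage_for_embedding_model")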
12 comments
You don't need to provide the model path here; just provide the model name.

Plain Text
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# loads BAAI/bge-small-en
# embed_model = HuggingFaceEmbedding()

# loads BAAI/bge-small-en-v1.5
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

If you run this same line, it will first look in the cache folder and load the model from there if it exists.
But how do I enforce loading it from a local file? Because I think this way it will be downloaded every time, no?
No. When this line runs: embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5"), it will first look in the cache folder; if the model is there, it will pick that. Only if it isn't will it download the model.
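
If the goal is to control where that cached copy lives, recent versions of HuggingFaceEmbedding also take a cache_folder argument. A minimal sketch, with a hypothetical cache path:

Plain Text
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# Downloads on the first run, then reuses the copy under ./hf_model_cache
# on later runs instead of contacting the Hugging Face Hub again.
embed_model = HuggingFaceEmbedding(
    model_name="BAAI/bge-small-en-v1.5",
    cache_folder="./hf_model_cache",  # hypothetical path, choose your own
)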
So it will be in the cache folder as long as it has been downloaded at least once, right?
Ok, I will try, thanks.
Can I also use local_files_only=True?
Not seeing this variable mentioned in the HF embedding class. What do you want to achieve with this?
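
For anyone who wants to hard-enforce offline loading regardless of the wrapper's own arguments, the underlying Hugging Face libraries honor the standard offline environment variables. A minimal sketch, assuming a recent huggingface_hub/transformers stack:

Plain Text
import os

# Set these before the Hugging Face libraries are used; any attempt to
# reach the Hub will then raise an error instead of downloading, so the
# model must already be in the cache or a local folder.
os.environ["HF_HUB_OFFLINE"] = "1"
os.environ["TRANSFORMERS_OFFLINE"] = "1"

from llama_index.embeddings.huggingface import HuggingFaceEmbedding

embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")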