Huggingface

At a glance

The post describes an error encountered when loading the "Cohere/Cohere-embed-english-v3.0" model through LlamaIndex's HuggingFaceEmbedding class. The error is a ValueError raised by transformers' AutoConfig, which cannot determine the model type, and the community members discuss possible reasons for this issue.

The comments explain that Cohere did not actually release the model weights, and that the files on Hugging Face contain only the tokenizer. This means the model can only be used through Cohere's API. The original poster suggests adding a warning to the documentation to make this clear, and a community member agrees that it could help users less familiar with Hugging Face.

Additionally, the community members note that the LlamaIndex documentation implies that any model from the MTEB leaderboard can be used, but this is not the case, and there are some requirements that users need to be aware of.
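One way to see those requirements in practice is to check what a repository contains before trying to load it locally. The following is a minimal sketch, not LlamaIndex or transformers code: the helper names, the suffix list, and the example file listings are all hypothetical. In real use, a listing could be fetched with huggingface_hub.list_repo_files(repo_id), which is not called here so the example runs offline.

```python
from typing import Iterable

# Suffixes commonly used for serialized model weights.
# Assumption for illustration only; this is not an exhaustive list.
WEIGHT_SUFFIXES = (".safetensors", ".bin", ".h5", ".msgpack")


def has_local_weights(filenames: Iterable[str]) -> bool:
    """Return True if any file name looks like a weight checkpoint."""
    return any(name.lower().endswith(WEIGHT_SUFFIXES) for name in filenames)


def is_probably_loadable(filenames: Iterable[str]) -> bool:
    """Heuristic: a locally loadable model needs a root config.json and weights."""
    names = list(filenames)
    return "config.json" in names and has_local_weights(names)


# Hypothetical listings, written by hand for this example.
repo_with_weights = ["config.json", "model.safetensors", "tokenizer.json"]
tokenizer_only_repo = ["README.md", "tokenizer.json", "tokenizer_config.json"]

print(is_probably_loadable(repo_with_weights))
print(is_probably_loadable(tokenizer_only_repo))
```

A repository that fails this kind of check is a hint that the model is only available through a hosted API, as the thread concludes for the Cohere model.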

thoraxe

Traceback (most recent call last):
  File "/opt/app-root/src/llamaindex-rag-example/starter.py", line 8, in <module>
    embed_model = HuggingFaceEmbedding(model_name="Cohere/Cohere-embed-english-v3.0")
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/llama_index/embeddings/huggingface.py", line 82, in __init__
    model = AutoModel.from_pretrained(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/transformers/models/auto/auto_factory.py", line 526, in from_pretrained
    config, kwargs = AutoConfig.from_pretrained(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/transformers/models/auto/configuration_auto.py", line 1132, in from_pretrained
    raise ValueError(
ValueError: Unrecognized model in Cohere/Cohere-embed-english-v3.0. Should have a `model_type` key in its config.json, or contain one of the following strings in its name: ...

looks like you don't support all embedding models from the MTEB leaderboard.
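For readers unfamiliar with the error message, the sketch below illustrates the kind of lookup it describes: use the `model_type` key from config.json if present, otherwise look for a known type string in the model name, otherwise fail. This is a simplified illustration and not the transformers source. The function name resolve_model_type and the short known_types list are hypothetical, and the real library consults a much larger registry.

```python
from typing import Mapping, Sequence


def resolve_model_type(
    config: Mapping[str, object],
    model_name: str,
    known_types: Sequence[str],
) -> str:
    """Simplified, hypothetical model-type lookup (not the library's code)."""
    # 1. Prefer an explicit `model_type` key in the config.
    if "model_type" in config:
        return str(config["model_type"])
    # 2. Fall back to a substring match on the name, longest pattern first,
    #    so that "roberta" wins over "bert" for a name containing both.
    for pattern in sorted(known_types, key=len, reverse=True):
        if pattern in model_name:
            return pattern
    # 3. Nothing matched: fail in the same spirit as the traceback above.
    raise ValueError(
        f"Unrecognized model in {model_name}. Should have a `model_type` key "
        "in its config.json, or contain one of the following strings in its "
        f"name: {', '.join(known_types)}"
    )


# Hypothetical registry, kept tiny for the example.
KNOWN_TYPES = ["bert", "roberta"]

print(resolve_model_type({"model_type": "bert"}, "some-org/some-model", KNOWN_TYPES))
print(resolve_model_type({}, "some-org/my-roberta-base", KNOWN_TYPES))
```

Under these assumptions, a name such as "Cohere/Cohere-embed-english-v3.0" with no `model_type` in its config matches neither step and ends in the ValueError.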

7 comments

Logan M

Cohere did not actually release the model weights

Logan M

So you can only use it over their API

thoraxe

sounds like you need a warning in your docs 🤷‍♂️

Logan M

The Hugging Face files only have the tokenizer: https://huggingface.co/Cohere/Cohere-embed-english-v3.0/tree/main

Logan M

I think this is just a symptom of the fact that anyone can upload anything to Hugging Face? But sure, the docs could use a warning for those less familiar with Hugging Face

thoraxe

the way that https://docs.llamaindex.ai/en/stable/examples/embeddings/huggingface.html is written, the implication is that you can just pick something from the leaderboard and you're good

thoraxe

but there are clearly some requirements 🙂
