Are the InstructorEmbeddings not working

Are the InstructorEmbeddings not working right now?

I have tried multiple examples, including https://docs.llamaindex.ai/en/stable/examples/embeddings/huggingface.html#huggingfaceembedding and https://docs.llamaindex.ai/en/stable/examples/embeddings/custom_embeddings.html

Both result in the error:

/usr/local/lib/python3.10/dist-packages/sentence_transformers/SentenceTransformer.py in __init__(self, model_name_or_path, modules, device, cache_folder, trust_remote_code, revision, token, use_auth_token)
    192 
    193             if is_sentence_transformer_model(model_name_or_path, token, cache_folder=cache_folder, revision=revision):
--> 194                 modules = self._load_sbert_model(
    195                     model_name_or_path,
    196                     token=token,

TypeError: INSTRUCTOR._load_sbert_model() got an unexpected keyword argument 'token'

Oh, that's weird.

That's coming directly from SentenceTransformer (and likely, further up the stack, from the instructor package?)

I'm guessing one of those packages has a bug
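
For anyone reading along, here is a minimal sketch of the kind of mismatch that can produce this TypeError. The names `BaseLoader`, `OldStyleSubclass`, and `_load_model` are made up for illustration and are not the real SentenceTransformer or INSTRUCTOR code. The assumption is simply that a subclass overrides a loader method using an older signature while the parent class now passes `token=`.

```python
class BaseLoader:
    """Hypothetical stand-in for a library class whose newer release passes an extra keyword."""

    def __init__(self, name, token=None):
        # The newer base class forwards `token` to the loader hook.
        self.modules = self._load_model(name, token=token)

    def _load_model(self, name, token=None):
        return [name]


class OldStyleSubclass(BaseLoader):
    """Hypothetical downstream subclass written against the older hook signature."""

    def _load_model(self, name):  # no `token` parameter here
        return [name]


def try_construct(cls, name):
    """Return the TypeError message if construction fails, otherwise None."""
    try:
        cls(name)
    except TypeError as exc:
        return str(exc)
    return None


print(try_construct(BaseLoader, "demo"))        # base class on its own is fine
print(try_construct(OldStyleSubclass, "demo"))  # override rejects the new keyword
```

If that is what is happening here, either side could resolve it: the subclass accepts the new keyword, or the parent package is kept at a version that does not pass it.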

I guess so, hmm... is there a prior version I should be using?

You can use llama_index.legacy to get things working for now (please don't mix .legacy imports with any new imports from v0.10).
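
A rough way to check a script for mixed imports is sketched below. It only uses the standard library `ast` module. The helper names are hypothetical, and it assumes that any `llama_index.*` import outside `llama_index.legacy` counts as a "new" v0.10 import.

```python
import ast


def llama_index_import_styles(source):
    """Return a subset of {"legacy", "new"} for the llama_index imports found in `source`."""
    styles = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Join module and imported name so `from llama_index import legacy` is seen as legacy.
            modules = [f"{node.module}.{alias.name}" for alias in node.names]
        else:
            continue
        for module in modules:
            if module == "llama_index.legacy" or module.startswith("llama_index.legacy."):
                styles.add("legacy")
            elif module == "llama_index" or module.startswith("llama_index."):
                styles.add("new")
    return styles


def mixes_legacy_and_new(source):
    """True when a script imports from both llama_index.legacy and the rest of llama_index."""
    return llama_index_import_styles(source) == {"legacy", "new"}


example = (
    "from llama_index.legacy import VectorStoreIndex\n"
    "from llama_index.core import Settings\n"
)
print(mixes_legacy_and_new(example))
```

This only inspects static import statements in one file, so dynamic imports or imports hidden in other modules would not be caught.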

Would you be able to submit an issue for this?

https://github.com/run-llama/llama_index/issues/new/choose

Sure, no problem.

We can take a look at how to resolve it from there 🙂 . Thanks!

https://github.com/run-llama/llama_index/issues/11037

🙏

@nerdai @Logan M

It is an issue with the new Sentence Transformers version.

A quick fix was to use

pip install sentence-transformers==2.2.2
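
To confirm which version is actually installed in the environment before or after pinning, something like the sketch below may help. It uses only the standard library. The helper names are hypothetical, and treating every release newer than 2.2.2 as suspect is an assumption: this thread only establishes that 2.2.2 works.

```python
from importlib import metadata

KNOWN_GOOD = "2.2.2"  # the version reported to work in this thread


def parse_version(text):
    """Turn '2.3.0' or '2.3.0.dev0' into a tuple of leading integers, e.g. (2, 3, 0)."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_newer_than_known_good(installed, known_good=KNOWN_GOOD):
    """True when `installed` is a later release than the known-good pin."""
    return parse_version(installed) > parse_version(known_good)


current = installed_version("sentence-transformers")
if current is None:
    print("sentence-transformers is not installed")
elif is_newer_than_known_good(current):
    print(f"sentence-transformers {current} is newer than {KNOWN_GOOD}; consider pinning")
else:
    print(f"sentence-transformers {current} is at or below {KNOWN_GOOD}")
```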

Good catch!

Great! Should we pin this in the pyproject.toml?

mmm maybe. Or at least until they fix it? haha

yea we can leave a note in the pyproject.toml to remove the pin once they fix it 🙂

Must be very challenging to maintain llamaindex at this point with so many dependencies in a field that is rapidly evolving.

tbh the recent split to packages per integration made it much easier haha

but it does move fast yea

Still a bit stuck with InstructorEmbeddings. Now I am getting the following error when running:

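from llama_index.core import Settings, VectorStoreIndex

# InstructorEmbeddings is the custom embedding class from the custom_embeddings example linked below;
# `documents` is assumed to have been loaded earlier.
embed_model = InstructorEmbeddings(embed_batch_size=2)

Settings.embed_model = embed_model
Settings.chunk_size = 512

# if running for the first time, will download model weights first!
index = VectorStoreIndex.from_documents(documents)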

FileNotFoundError: [Errno 2] No such file or directory: '/root/.cache/torch/sentence_transformers/intfloat_multilingual-e5-large-instruct/sentence_xlnet_config.json'

https://huggingface.co/intfloat/multilingual-e5-large-instruct/tree/main

This file is not part of the model files. Not sure why it is looking for it.

I am following https://docs.llamaindex.ai/en/stable/examples/embeddings/custom_embeddings.html
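
In case it helps with debugging, here is a small sketch for listing what actually landed in the local cache directory named in the FileNotFoundError, so it can be compared against the file list on the Hugging Face page. The helper names are hypothetical; the path is copied from the error above and may differ on other machines. This does not explain why the loader asks for that file, it only shows what is on disk.

```python
from pathlib import Path


def list_cached_files(cache_dir):
    """Return the sorted file names in `cache_dir`, or an empty list if it does not exist."""
    directory = Path(cache_dir).expanduser()
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.iterdir() if p.is_file())


def sentence_config_files(file_names):
    """Pick out names shaped like sentence_*_config.json from a list of file names."""
    return [
        name
        for name in file_names
        if name.startswith("sentence_") and name.endswith("_config.json")
    ]


# Path taken from the error message in this thread.
cache_dir = "/root/.cache/torch/sentence_transformers/intfloat_multilingual-e5-large-instruct"
files = list_cached_files(cache_dir)
print(files)
print(sentence_config_files(files))
```

If the second list comes back empty, the cached model has no `sentence_*_config.json` file at all, which would be consistent with the observation that the file is not part of the model repository.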
