
Updated 4 months ago

Are the InstructorEmbeddings not working

At a glance

The community member is experiencing issues with the InstructorEmbeddings in the llamaindex library, encountering errors related to the SentenceTransformer package. The community members discuss the possibility of a bug in one of the dependencies, and suggest using an older version of the sentence-transformers package as a workaround. They also recommend submitting an issue on the llamaindex repository to investigate the problem further. Additionally, the community members note that maintaining llamaindex can be challenging due to the rapidly evolving dependencies in the field.

Are the InstructorEmbeddings not working right now?

I have tried multiple examples, including https://docs.llamaindex.ai/en/stable/examples/embeddings/huggingface.html#huggingfaceembedding and https://docs.llamaindex.ai/en/stable/examples/embeddings/custom_embeddings.html

Both result in the error:

/usr/local/lib/python3.10/dist-packages/sentence_transformers/SentenceTransformer.py in __init__(self, model_name_or_path, modules, device, cache_folder, trust_remote_code, revision, token, use_auth_token)
    192 
    193             if is_sentence_transformer_model(model_name_or_path, token, cache_folder=cache_folder, revision=revision):
--> 194                 modules = self._load_sbert_model(
    195                     model_name_or_path,
    196                     token=token,

TypeError: INSTRUCTOR._load_sbert_model() got an unexpected keyword argument 'token'
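
For anyone reading the traceback above: it looks like the usual mismatch where a base class starts passing a new keyword to a method that a subclass overrode with the older, narrower signature. Below is a minimal sketch of that mechanism only. `Base`, `Child`, `_load` and `try_build` are hypothetical stand-ins, not the real sentence-transformers or InstructorEmbedding code.

```python
class Base:
    """Stand-in for a newer base class that passes an extra keyword."""

    def __init__(self, name, token=None):
        # The newer constructor forwards `token` to the loading hook.
        self.modules = self._load(name, token=token)

    def _load(self, name, token=None):
        return [name]


class Child(Base):
    """Stand-in for a subclass written against the older base class."""

    # This override has no `token` parameter, so the call above fails.
    def _load(self, name):
        return [name.upper()]


def try_build(cls):
    """Return "ok" if construction works, else the TypeError message."""
    try:
        cls("demo-model")
        return "ok"
    except TypeError as exc:
        return str(exc)


if __name__ == "__main__":
    print(try_build(Base))
    print(try_build(Child))
```

If that is what is happening here, either the subclass needs to accept the new keyword or the base package has to be held at a version that does not pass it.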
19 comments
oh thats weird
thats coming directly from SentenceTransformer (and likely above, the instructor package?)
I'm guessing one of those packages has a bug
I guess so, hmm... is there a prior version I should be using?
You can use llama_index.legacy to get things working for now (please don't mix .legacy imports with any new imports from v0.10).

Would you be able to submit an issue for this?

https://github.com/run-llama/llama_index/issues/new/choose
Sure, no problem.
We can take a look at how to resolve it from there 🙂. Thanks!
@nerdai @Logan M

It is an issue with the new Sentence Transformer.

A quick fix was to use

pip install sentence-transformers==2.2.2
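
A quick way to confirm which version actually ended up in the environment after pinning is sketched below. It uses only the standard library; `installed_version` and `matches_pin` are hypothetical helper names, and 2.2.2 is simply the version reported as working in this thread.

```python
from importlib.metadata import PackageNotFoundError, version

PINNED = "2.2.2"  # version reported as working in this thread


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def matches_pin(installed, pinned=PINNED):
    """True only when the installed version equals the pinned one exactly."""
    return installed == pinned


if __name__ == "__main__":
    found = installed_version("sentence-transformers")
    print(f"sentence-transformers: {found}, matches pin: {matches_pin(found)}")
```

In a notebook, restart the runtime after the pip install, since an already imported package keeps its old version in memory.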
great! should we pin this in the pyproject.toml ?
mmm maybe. Or at least until they fix it? haha
yea we can leave a note in the pyproject.toml to remove the pin once they fix it 🙂
Must be very challenging to maintain llamaindex at this point with so many dependencies in a field that is rapidly evolving.
tbh the recent split to packages per integration made it much easier haha
but it does move fast yea
Still a bit stuck with this instructorembedding. Now I am getting the following error when running:

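from llama_index.core import Settings, VectorStoreIndex

# InstructorEmbeddings is the custom class from the custom_embeddings example linked below;
# documents is assumed to have been loaded earlier.
embed_model = InstructorEmbeddings(embed_batch_size=2)

Settings.embed_model = embed_model
Settings.chunk_size = 512

# if running for the first time, will download model weights first!
index = VectorStoreIndex.from_documents(documents)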


FileNotFoundError: [Errno 2] No such file or directory: '/root/.cache/torch/sentence_transformers/intfloat_multilingual-e5-large-instruct/sentence_xlnet_config.json'


https://huggingface.co/intfloat/multilingual-e5-large-instruct/tree/main

This file is not part of the model files. Not sure why it is looking for it.

I am following https://docs.llamaindex.ai/en/stable/examples/embeddings/custom_embeddings.html
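
To compare what is on disk with the file list on the Hugging Face page, something like the sketch below could help. It only walks a local directory with the standard library; `list_cached_files` and `missing_from_cache` are hypothetical helper names, and the cache path is the one from the error message above.

```python
from pathlib import Path


def list_cached_files(model_dir):
    """Return sorted relative paths of every file under model_dir."""
    root = Path(model_dir)
    if not root.is_dir():
        return []
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )


def missing_from_cache(model_dir, wanted):
    """Return the names in `wanted` that are not present under model_dir."""
    present = set(list_cached_files(model_dir))
    return [name for name in wanted if name not in present]


if __name__ == "__main__":
    cache = "/root/.cache/torch/sentence_transformers/intfloat_multilingual-e5-large-instruct"
    print(list_cached_files(cache))
    print(missing_from_cache(cache, ["sentence_xlnet_config.json"]))
```

If the file is absent both locally and in the model repository, the loader is asking for something the model was never published with, which points at the loading code rather than at a broken download.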
Well, anyway, it does work with hkunlp/instructor-large so maybe this is beyond the scope of llamaindex help haha.