
Are the InstructorEmbeddings not working?

Are the InstructorEmbeddings not working right now?

I have tried multiple examples, including
https://docs.llamaindex.ai/en/stable/examples/embeddings/huggingface.html#huggingfaceembedding
and
https://docs.llamaindex.ai/en/stable/examples/embeddings/custom_embeddings.html

Both result in the error:

Plain Text
/usr/local/lib/python3.10/dist-packages/sentence_transformers/SentenceTransformer.py in __init__(self, model_name_or_path, modules, device, cache_folder, trust_remote_code, revision, token, use_auth_token)
    192 
    193             if is_sentence_transformer_model(model_name_or_path, token, cache_folder=cache_folder, revision=revision):
--> 194                 modules = self._load_sbert_model(
    195                     model_name_or_path,
    196                     token=token,

TypeError: INSTRUCTOR._load_sbert_model() got an unexpected keyword argument 'token'
19 comments
oh that's weird
that's coming directly from SentenceTransformer (and likely above that, the instructor package?)
I'm guessing one of those packages has a bug
I guess so, hmm... is there a prior version I should be using?
You can use llama_index.legacy to get things working for now (please don't mix .legacy imports with any new imports from v0.10).

Would you be able to submit an issue for this?

https://github.com/run-llama/llama_index/issues/new/choose
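
(For reference, a minimal sketch of the legacy-import workaround, assuming llama_index.legacy mirrors the pre-0.10 module layout; the exact import path may differ:)

Python
# Sketch only: use the legacy namespace until the incompatibility is fixed.
# Assumes llama_index.legacy exposes the pre-0.10 InstructorEmbedding class.
from llama_index.legacy.embeddings import InstructorEmbedding

embed_model = InstructorEmbedding(model_name="hkunlp/instructor-large")
print(len(embed_model.get_text_embedding("Hello world")))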
Sure, no problem.
We can take a look at how to resolve it from there πŸ™‚ . Thanks!
@nerdai @Logan M

It is an issue with the new sentence-transformers version.

A quick fix was to use

pip install sentence-transformers==2.2.2
great! should we pin this in the pyproject.toml?
mmm maybe. Or at least until they fix it? haha
yea we can leave a note in the pyproject.toml to remove the pin once they fix it πŸ™‚
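
(For reference, such a pin plus note might look roughly like this in the relevant package's pyproject.toml; a sketch only, the actual file and constraint depend on the repo layout:)

TOML
[tool.poetry.dependencies]
# TODO: remove this pin once the INSTRUCTOR / sentence-transformers incompatibility is fixed upstream
sentence-transformers = "2.2.2"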
Must be very challenging to maintain llamaindex at this point with so many dependencies in a field that is rapidly evolving.
tbh the recent split to packages per integration made it much easier haha
but it does move fast yea
Still a bit stuck with this InstructorEmbedding. Now I am getting the following error when running:

Python
from llama_index.core import Settings, VectorStoreIndex

# InstructorEmbeddings is the custom embedding class defined in the
# custom_embeddings example linked above; `documents` was loaded earlier
embed_model = InstructorEmbeddings(embed_batch_size=2)

Settings.embed_model = embed_model
Settings.chunk_size = 512

# if running for the first time, will download model weights first!
index = VectorStoreIndex.from_documents(documents)


Plain Text
FileNotFoundError: [Errno 2] No such file or directory: '/root/.cache/torch/sentence_transformers/intfloat_multilingual-e5-large-instruct/sentence_xlnet_config.json'


https://huggingface.co/intfloat/multilingual-e5-large-instruct/tree/main

This file is not part of the model files. Not sure why it is looking for it.

I am following https://docs.llamaindex.ai/en/stable/examples/embeddings/custom_embeddings.html
Well, anyway, it does work with hkunlp/instructor-large so maybe this is beyond the scope of llamaindex help haha.
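
(For reference: intfloat/multilingual-e5-large-instruct is a plain sentence-transformer rather than an INSTRUCTOR checkpoint, which is likely why the INSTRUCTOR loader goes looking for config files that repo doesn't have. A minimal sketch of loading it through HuggingFaceEmbedding instead, assuming the llama-index-embeddings-huggingface package is installed:)

Python
# Sketch: load a plain sentence-transformer model through HuggingFaceEmbedding
# instead of the INSTRUCTOR-specific custom class.
from llama_index.core import Settings, VectorStoreIndex, SimpleDirectoryReader
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

embed_model = HuggingFaceEmbedding(
    model_name="intfloat/multilingual-e5-large-instruct",
    embed_batch_size=2,
)
Settings.embed_model = embed_model
Settings.chunk_size = 512

documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)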