shapes not aligned error

At a glance

The community member is using the HF TEI server to embed text but encounters a ValueError ("shapes not aligned") when performing a lookup. They try deleting the local database and reindexing, and confirm that the embedding server is hit during querying. After some investigation, they realize the issue is user error: the indexing script was not loading its environment from dotenv, so indexing silently embedded locally with a different model while querying used the TEI server. Once the environment loading was fixed, the problem was resolved.

Useful resources
probably user error. I am using the HF TEI server to embed, but then when I try to do a lookup, I get this error:
ValueError: shapes (1024,) and (768,) not aligned: 1024 (dim 0) != 768 (dim 0)
https://gist.github.com/thoraxe/583ee9f8d2a21a562f42535da47cee0d
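The traceback is the vector store computing a dot product between a 1024-dim query embedding and 768-dim stored document embeddings; any two embedding models with different output sizes will trigger it. A minimal NumPy reproduction, with the dimensions taken from the error above:

```python
import numpy as np

# Stored document vectors came from a 768-dim model, while the query
# was embedded by a 1024-dim model.
doc_vec = np.zeros(768)
query_vec = np.zeros(1024)

try:
    np.dot(query_vec, doc_vec)  # the similarity computation inside the store
except ValueError as e:
    print(e)  # shapes (1024,) and (768,) not aligned: 1024 (dim 0) != 768 (dim 0)
```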
36 comments
I made sure to nuke the local db/index and reindex before trying
it's definitely hitting the embedding server during the querying
hmmm you are extra sure you blew away the old index at vector-db/ocp-product-docs ?
i will be triple extra sure one sec
⌁68% [thoraxe:~/Red_Hat/ … ft/llamaindex-experiments/fastapi-lightspeed-service] [fastapi-ols-39] fix-rag(+12/-1)* 1 Β± ls vector-db/
ocp-product-docs  summary-docs
⌁68% [thoraxe:~/Red_Hat/ … ft/llamaindex-experiments/fastapi-lightspeed-service] [fastapi-ols-39] fix-rag(+12/-1)* Β± rm -rf vector-db/*
⌁68% [thoraxe:~/Red_Hat/ … ft/llamaindex-experiments/fastapi-lightspeed-service] [fastapi-ols-39] fix-rag(+12/-1)* Β± ls vector-db/
⌁68% [thoraxe:~/Red_Hat/ … ft/llamaindex-experiments/fastapi-lightspeed-service] [fastapi-ols-39] fix-rag(+12/-1)* Β± 
ValueError: shapes (1024,) and (768,) not aligned: 1024 (dim 0) != 768 (dim 0)
let me crank up debug logging
hmm, i didn't get any more logging details
And you are sure the service context/embed model is the same when indexing vs. querying?
maybe add a print(service_context.embed_model) before performing each step to confirm
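A stdlib-only sketch of that sanity check; the `EmbedModel` stub is hypothetical and only mimics how the real `service_context.embed_model` object prints its `model_name` and `base_url`:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class EmbedModel:
    # Stand-in for service_context.embed_model; the real object's repr also
    # shows model_name and base_url, which is what you compare by eye.
    model_name: str
    base_url: Optional[str]  # None means embedding runs locally

def check_same_embed_model(index_model: EmbedModel, query_model: EmbedModel) -> None:
    """Print both models, and fail fast if indexing and querying disagree."""
    print("indexing:", index_model)
    print("querying:", query_model)
    if index_model != query_model:
        raise RuntimeError("embed model mismatch between indexing and querying")
```

With the same model on both sides this passes silently; a local model on one side and a TEI-served model on the other raises immediately instead of surfacing later as a shape error.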
it should be, the code is the same
but i will check
2023-11-10 12:26:33,955 [docs_summarizer.py:76] INFO: 1234 using embed model: model_name='BAAI/bge-base-en-v1.5' embed_batch_size=10 callback_manager=<llama_index.callbacks.base.CallbackManager object at 0x7f6290d58e20> base_url='xxx' query_instruction=None text_instruction=None timeout=60.0 truncate_text=True
the model name is the same for both when I index and when I query
no, wait, it's not.
something weird is going on
user error, I knew it.
yeah ok, so while I thought I was using the embedding server when indexing, I was in fact not, because that file was not loading env from dotenv
and the shell i happened to run the test from didn't have the TEI server set, so it was actually embedding locally
i just assumed it was still working since yesterday
but this is good, now i print the embed model πŸ™‚
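A stdlib-only sketch of the failure mode: an env lookup that silently comes back empty when dotenv's `load_dotenv()` was never called in that file. The variable name `TEI_SERVER_URL` is an assumption, not necessarily the project's actual name:

```python
import os
from typing import Optional

def get_tei_url() -> Optional[str]:
    # In the thread, the indexing script never ran load_dotenv(), so this
    # lookup returned None and the code silently embedded locally.
    url = os.environ.get("TEI_SERVER_URL")  # hypothetical variable name
    if url is None:
        # Warn loudly instead of silently falling back to a local model
        # that may have a different embedding dimension.
        print("WARNING: TEI_SERVER_URL unset; embeddings will be computed locally")
    return url
```

Printing the resolved embed model before indexing and again before querying, as suggested above, makes this kind of mismatch visible immediately.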
Nice, good catch!
well, you caught it
heh pair debugging!
/giphy teamwork