Why is the embedding null when uploading

At a glance

The community members are experiencing an issue where the embedding is null when uploading to Qdrant, even though manually executing the embedding computation yields non-null results. They have determined that the issue is not specific to Qdrant, as they can reproduce it with Chroma as well. It appears to come from the llama-index library, which does not populate the embedding field of the node when retrieving data, since the embedding is stored separately from the node. A community member suggests that a PR could be made to attach the embedding to the node in the vector database being used.

The community members have also encountered a strange aspect of the problem: it only occurs in a Docker container, not in their local setup. They initially suspected a difference in the version of the llama-index library (0.8.41 locally vs. 0.8.46 in Docker), but found that the newer version also works fine locally, so they remain unsure why the Docker version of the retriever is failing in this way.

After some investigation, a community member suggests checking whether the query passed in the Docker container is actually a string. This turns out to be the right pointer: the query is not a string but a Chainlit message that needs to be unwrapped, and because Chainlit was only pinned to a minor version, pinning it to a specific patch version resolves the issue.

Why is the embedding null when uploading it to Qdrant? What can I do to debug it? Manually executing the embedding computation yields non-null results.
16 comments
In fact this is unrelated to qdrant - I can totally reproduce this with chroma
I.e. embedding_model.get_text_embedding(documents[0].text) is returning a value; however, with VectorStoreIndex.from_documents, querying via index.as_retriever().retrieve('foo')[0].embedding is null/empty
It's not that it's empty, it's just that llama-index is not populating the embedding field of the node when retrieving (the embedding is stored separately from the node)

There could be a PR to attach the embedding to the node in the vector db you are using
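
For reference, here is a minimal sketch of the behaviour being described (assuming llama-index ~0.8.x with the default OpenAI embedding model; the sample document is made up):

from llama_index import Document, VectorStoreIndex
from llama_index.embeddings import OpenAIEmbedding

documents = [Document(text="hello world")]
embed_model = OpenAIEmbedding()

# Calling the embedding model directly returns a non-null vector.
vector = embed_model.get_text_embedding(documents[0].text)
print(len(vector))

# But the node handed back by the retriever has no embedding attached,
# because llama-index stores the vector separately from the node.
index = VectorStoreIndex.from_documents(documents)
node = index.as_retriever(similarity_top_k=10).retrieve("foo")[0].node
print(node.embedding)  # None in the setups described in this thread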
| result = self.index.as_retriever(similarity_top_k=10).retrieve(query)
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| File "/opt/conda/lib/python3.11/site-packages/llama_index/indices/base_retriever.py", line 22, in retrieve
| return self._retrieve(str_or_query_bundle)
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| File "/opt/conda/lib/python3.11/site-packages/llama_index/indices/vector_store/retrievers/retriever.py", line 81, in _retrieve
| if query_bundle.embedding is None and len(query_bundle.embedding_strs) > 0:
as I am running into 'Message' object has no attribute 'embedding'
but only in the docker container
my local installation works fine
Locally I have 0.8.41; in Docker, where it fails, it's 0.8.46.
However, when using the newer version locally it also works.
Do you have any idea why the docker version of the retriever is failing in this strange way?
hmm that's quite strange. maybe check if the query you pass in inside the docker container is indeed a string?
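
A quick way to check this (a rough sketch, not from the thread: `index` stands in for whatever VectorStoreIndex the app builds, and `query` is whatever the caller passes in):

def debug_retrieve(index, query):
    # Log exactly what the caller handed us; a Chainlit Message
    # (or any other non-str object) will show up here.
    print(type(query), repr(query))
    if not isinstance(query, str):
        raise TypeError(f"expected a query string, got {type(query).__name__}")
    return index.as_retriever(similarity_top_k=10).retrieve(query)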
A simple python script inside docker works just fine. However, a chainlit-based one fails with this error
Wow - indeed this was the right pointer. I have no idea why inside docker the message is not a string (it indeed is a string in the local setup) but it is a chainlit message which needs to be unwrapped. @disiok thanks!
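
For anyone hitting the same thing, a hedged sketch of the unwrapping fix (the exact behaviour depends on the Chainlit version: some versions pass the handler a plain string, others a cl.Message; `index` is assumed to be the VectorStoreIndex built elsewhere at startup):

import chainlit as cl

@cl.on_message
async def main(message):
    # Unwrap the Chainlit message to a plain string before retrieval.
    query = message.content if isinstance(message, cl.Message) else message
    result = index.as_retriever(similarity_top_k=10).retrieve(query)
    await cl.Message(content=result[0].node.get_content()).send()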
turns out chainlit was only pinned to the minor version and not to the patch - and they recently bumped from 0.3 to 0.300 ...
glad it helped!