The community members are experiencing an issue where the embedding is null when uploading to Qdrant, but manually executing the embedding computation yields non-null results. They have determined that this issue is not specific to Qdrant, as they can reproduce it with Chroma as well. The issue seems to be related to the llama-index library, where the embedding field of the node is not being populated when retrieving the data. A community member suggests that there may be a PR to attach the embedding to the node in the vector database being used.
The community members have also encountered a strange issue where the problem only occurs in a Docker container, but not in their local setup. They have narrowed it down to a difference in the version of the llama-index library (0.8.41 in Docker vs. 0.8.46 locally), and have found that the newer version works fine locally. However, they are still unsure why the Docker version of the retriever is failing in this way.
After some investigation, a community member suggests that the issue may be related to the query being passed in the Docker container: it is not a string but a Chainlit message that needs to be unwrapped. This turns out to be the correct diagnosis, and the community members find that pinning Chainlit to a specific patch version resolves the issue.
I.e. embedding_model.get_text_embedding(documents[0].text) returns a value; however, for an index built with VectorStoreIndex.from_documents, index.as_retriever().retrieve('foo')[0].embedding is null/empty
It's not that it's empty; it's just that llama-index is not populating the embedding field of the node when retrieving (the embedding is stored separately from the node)
There could be a PR to attach the embedding to the node in the vector db you are using
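A minimal sketch of the behavior described above, assuming (as the message says) that embeddings live in the vector store separately from node content and are not re-attached on retrieval. The store layout, class, and function names here are illustrative, not llama-index's actual internals:

```python
# Toy model: node content and embeddings are kept in separate maps,
# mimicking a vector store that does not re-attach vectors on retrieval.
# This is NOT llama-index's real code.

class Node:
    def __init__(self, text, embedding=None):
        self.text = text
        self.embedding = embedding  # stays None unless explicitly attached

node_store = {"n1": Node("hello world")}
embedding_store = {"n1": [0.1, 0.2, 0.3]}

def retrieve(node_id):
    # plain retrieval returns the node as stored -- embedding is None
    return node_store[node_id]

def retrieve_with_embedding(node_id):
    # the fix the thread hints at: look the vector back up and attach it
    node = node_store[node_id]
    node.embedding = embedding_store[node_id]
    return node

print(retrieve("n1").embedding)                 # None
print(retrieve_with_embedding("n1").embedding)  # [0.1, 0.2, 0.3]
```

This is why the manual get_text_embedding call returns a value while the retrieved node's embedding field appears null: the vector exists in the store, but nothing copies it back onto the node.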
  result = self.index.as_retriever(similarity_top_k=10).retrieve(query)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/llama_index/indices/base_retriever.py", line 22, in retrieve
    return self._retrieve(str_or_query_bundle)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/llama_index/indices/vector_store/retrievers/retriever.py", line 81, in _retrieve
    if query_bundle.embedding is None and len(query_bundle.embedding_strs) > 0:
Wow - indeed this was the right pointer. I have no idea why inside Docker the message is not a string (it indeed is a string in the local setup), but it is a Chainlit message which needs to be unwrapped. @disiok thanks!
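A defensive unwrapping step along these lines would tolerate both setups. It assumes the incoming object is either a plain string or a Chainlit Message carrying its text in a .content attribute; the helper name is ours, not part of either library:

```python
# Hypothetical helper: coerce whatever Chainlit hands us into a plain string
# before passing it to the retriever. Assumes (per the thread) that a Chainlit
# Message keeps its text in a .content attribute.

def unwrap_query(query) -> str:
    if isinstance(query, str):
        return query
    # duck-type a Chainlit Message (or anything message-like)
    content = getattr(query, "content", None)
    if isinstance(content, str):
        return content
    raise TypeError(f"cannot extract query text from {type(query).__name__}")

# usage:
#   result = self.index.as_retriever(similarity_top_k=10).retrieve(unwrap_query(query))
```

Duck-typing on .content rather than importing chainlit keeps the retriever code working whether or not the caller wraps the query, which is exactly the difference between the local and Docker setups described above.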