Updated 2 months ago

Hey. I'm trying to use Qdrant with InstructorEmbedding. When I build the index with a storage_context, it raises a Pydantic error: PydanticSerializationError: Unable to serialize unknown type: <class 'numpy.ndarray'>

Here is my code
Plain Text
from llama_index.embeddings import InstructorEmbedding
from llama_index import (
    VectorStoreIndex,
    ServiceContext,
    SimpleDirectoryReader,
)
from llama_index.storage.storage_context import StorageContext
from llama_index.vector_stores.qdrant import QdrantVectorStore
from qdrant_client import QdrantClient

qdrant = QdrantClient("http://localhost:6333")

embed_model = InstructorEmbedding(model_name="hkunlp/instructor-base")

# load documents
documents = SimpleDirectoryReader("./data/paul_graham/").load_data()

service_context = ServiceContext.from_defaults(llm=None,
    embed_model=embed_model, chunk_size=512
)
vector_store = QdrantVectorStore(client=qdrant, collection_name="paul_graham")
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# This works
index = VectorStoreIndex.from_documents(
    documents, service_context=service_context
)

# This raises the PydanticSerializationError
index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context, service_context=service_context
)
11 comments
Can you share the full traceback?
It's pretty long.
hmmmm I THINK it's because InstructorEmbedding is returning a numpy array instead of a plain list of floats
(that traceback was very helpful btw)
let me test that theory
Plain Text
>>> embeds = embed_model.get_text_embedding("Hello world!")
>>> type(embeds)
<class 'numpy.ndarray'>
>>> 
ok, will patch that class then
omg. First time I have reported a bug and it will be shipped to something public.
p.s. these small wins in life 😉
Thanks for reporting this!!
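
Until a patched release is available, one possible workaround is to convert the embedding to a plain list of floats before it reaches the vector store. The helper below is only a sketch of that idea: `to_float_list` is a hypothetical name, not part of llama_index, and it is not necessarily how the library fix was implemented. It relies only on the `tolist()` method that `numpy.ndarray` provides.

```python
from typing import Any, List


def to_float_list(embedding: Any) -> List[float]:
    """Return the embedding as a plain list of Python floats.

    Hypothetical helper (not a llama_index API). Works for
    numpy.ndarray via its tolist() method, and for ordinary
    lists or tuples of numbers.
    """
    # numpy arrays expose tolist(), which yields native Python values
    if hasattr(embedding, "tolist"):
        embedding = embedding.tolist()
    # Coerce every element to float so the result is serializable
    return [float(x) for x in embedding]


# Assumed usage with the embed_model from the question:
# vector = to_float_list(embed_model.get_text_embedding("Hello world!"))
```

How to wire this into the index build (for example, by subclassing the embedding class and converting the return value of its embedding methods) depends on the llama_index version, so treat that part as an assumption to check against the installed code.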