
Hi, I'm trying to do multi-modal RAG like this:

Python
from llama_index.core import StorageContext
from llama_index.core.indices import MultiModalVectorStoreIndex
from llama_index.vector_stores.lancedb import LanceDBVectorStore
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.embeddings.clip import ClipEmbedding

text_store = LanceDBVectorStore(uri="lancedb", table_name="text_collection")
image_store = LanceDBVectorStore(uri="lancedb", table_name="image_collection")
storage_context = StorageContext.from_defaults(
    vector_store=text_store, image_store=image_store
)
index = MultiModalVectorStoreIndex.from_vector_store(
    vector_store=text_store,
    embed_model=HuggingFaceEmbedding(model_name="BAAI/bge-large-en-v1.5"),  # dim = 1024
    image_vector_store=image_store,
    image_embed_model=ClipEmbedding(model_name="ViT-L/14"),  # dim = 768
)
retriever_engine = index.as_retriever(
    similarity_top_k=2, image_similarity_top_k=2
)

But when I do retrieval, it raises ValueError: Query vector size 1024 does not match index column size 768

How can I modify the dimension for each vector store? I can't find the args for that. Thanks in advance.
Do you have the full traceback? It should be using CLIP for image retrieval and BAAI for text retrieval.
I don't, because I just switched to QdrantVectorStore. And yes, I used CLIP and BAAI.
But it works when using Qdrant somehow.
I'm using the same code; I just changed LanceDBVectorStore to QdrantVectorStore.
Yea, not really sure on that one πŸ€” maybe a LanceDB limitation?
But why don't most vector stores have an argument to set embed_dim, like PGVectorStore does?
That argument is very useful to me.
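For example, PGVectorStore takes the dimension explicitly, since pgvector columns have a fixed size (connection details here are placeholders):

Python
from llama_index.vector_stores.postgres import PGVectorStore

# pgvector creates a fixed-size vector column, so the dimension
# must be declared up front. Connection parameters are placeholders.
text_store = PGVectorStore.from_params(
    database="vector_db",
    host="localhost",
    password="password",
    port="5432",
    user="postgres",
    table_name="text_collection",
    embed_dim=1024,  # BAAI/bge-large-en-v1.5
)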
I think most don't actually need it; it gets handled automatically by the vector db.
The error you originally got tells me the wrong model was used to embed or query the collection; there was a dimension mismatch somewhere.
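A quick way to confirm the mismatch is to compare the dimensions the two models actually produce (a minimal sketch; note that CLIP embeds text queries too, which is how image retrieval works):

Python
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.embeddings.clip import ClipEmbedding

text_model = HuggingFaceEmbedding(model_name="BAAI/bge-large-en-v1.5")
image_model = ClipEmbedding(model_name="ViT-L/14")

# Both models can embed a text query; they just produce different sizes.
print(len(text_model.get_text_embedding("a query")))   # 1024
print(len(image_model.get_text_embedding("a query")))  # 768

If the 1024-dim text query ends up being searched against the 768-dim image table (or vice versa), you get exactly the ValueError above.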