LanceDB Metafilter Not Working

At a glance

A community member has metadata filtering working when using the LanceDB package directly, but gets no results when passing MetadataFilters from LlamaIndex to the LanceDB vector store. They share an example filter and ask what might be causing the problem. A reply points to a section of the LlamaIndex LanceDB integration that a LanceDB developer changed at one point, and notes that this integration is not used very often. The community member considers falling back to Postgres, but then retests in a new virtual environment, where the filter works, which suggests the problem was specific to their original environment.

PwnosaurusRex

I'm trying to add metadata filtering using LanceDB. It works fine when I use their package directly, as outlined here and here.

However, when I try to use MetadataFilters from LlamaIndex with LanceDB, I always get no results. Thoughts? Is it something to do with this section of code?

Example query (I tried the key metadata.theme as well):

Plain Text

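from llama_index.core.vector_stores import (
    FilterOperator,
    MetadataFilter,
    MetadataFilters,
)

# Match only nodes whose "theme" metadata equals "Fiction"
filters = MetadataFilters(
    filters=[
        MetadataFilter(
            key="theme", operator=FilterOperator.EQ, value="Fiction"
        ),
    ]
)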
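For anyone debugging the same thing, it can help to think about what string the filter might end up as. The sketch below is NOT the integration's code. It assumes (and this is only an assumption, suggested by the metadata.file_name key in the working example at the end of this thread) that node metadata lives under a nested metadata column and that filters become a SQL-like equality clause. SimpleFilter and to_where_clause are hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class SimpleFilter:
    """Hypothetical stand-in for a single equality metadata filter."""
    key: str
    value: Union[str, int, float]


def to_where_clause(filters: List[SimpleFilter], prefix: str = "metadata.") -> str:
    """Build a SQL-like equality clause, prefixing bare keys with `prefix`.

    Assumption: metadata is stored in a nested column named "metadata".
    """
    parts = []
    for f in filters:
        # Only add the prefix if the caller has not already written it,
        # so "theme" and "metadata.theme" produce the same clause.
        key = f.key if f.key.startswith(prefix) else prefix + f.key
        if isinstance(f.value, str):
            # Quote strings, doubling any embedded single quotes.
            escaped = f.value.replace("'", "''")
            value = f"'{escaped}'"
        else:
            # Leave numbers unquoted.
            value = str(f.value)
        parts.append(f"{key} = {value}")
    return " AND ".join(parts)


print(to_where_clause([SimpleFilter("theme", "Fiction")]))
print(to_where_clause([SimpleFilter("metadata.theme", "Fiction")]))
```

If a translation layer added the prefix unconditionally, a key that already starts with metadata. would be prefixed twice and match nothing, and a layer that never adds it would need the prefix from the caller. Either mismatch is one plausible way to end up with empty results, so trying both key forms (as above) is a reasonable first check.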

8 comments

PwnosaurusRex

@Logan M anywhere else I can look to troubleshoot?

Logan M

I wonder if it has to do with this?

Logan M

https://github.com/run-llama/llama_index/blob/3bce56d3521bb5b2d722cd4aca15f7973fc175a0/llama-index-integrations/vector_stores/llama-index-vector-stores-lancedb/llama_index/vector_stores/lancedb/base.py#L180

Logan M

A LanceDB dev changed this at one point

Logan M

he may have broken it lol

Logan M

(this integration isn't used terribly often)

PwnosaurusRex

I'll play around with it some more and see if I can fix it. Or maybe I should just stop trying new things and use good ol' postgres 😆

PwnosaurusRex

Hm...let me try again. Just testing in a new venv and it works, so maybe I'm just going crazy 🙂

Plain Text

from llama_index.core import (
    Settings,
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
)
from llama_index.core.vector_stores import (
    FilterOperator,
    MetadataFilter,
    MetadataFilters,
)
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.vector_stores.lancedb import LanceDBVectorStore

# Use a local HuggingFace model for embeddings
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

# Load files from ./data; the reader records file_name in each document's metadata
documents = SimpleDirectoryReader("./data", filename_as_id=True).load_data()

# Build the index on top of a LanceDB store persisted at ./lancedb
vector_store = LanceDBVectorStore(uri="./lancedb")
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context
)

# print([x.metadata for x in documents])

# Restrict retrieval to nodes that came from essay.txt
filters = MetadataFilters(
    filters=[
        MetadataFilter(key="metadata.file_name", operator=FilterOperator.EQ, value="essay.txt"),
    ]
)

# LlamaIndex test
retriever = index.as_retriever(filters=filters)
results = retriever.retrieve("discovered is to talk about space aliens")  # This now works!?!
print(results)
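Since the same code behaved differently in a fresh venv, one way to narrow things down is to compare installed package versions between the two environments. This is a minimal sketch using only the standard library. The package list is a guess at what is relevant here, and installed_versions is a hypothetical helper name, not part of LlamaIndex or LanceDB.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional

# Assumed to be the distributions that matter for this thread; adjust as needed.
PACKAGES = ("llama-index-core", "llama-index-vector-stores-lancedb", "lancedb")


def installed_versions(packages: Iterable[str] = PACKAGES) -> Dict[str, Optional[str]]:
    """Return {distribution name: installed version, or None if missing}."""
    found: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            found[name] = None
    return found


for name, ver in installed_versions().items():
    print(f"{name}: {ver or 'not installed'}")
```

Running this in both the old and the new venv and diffing the output should show whether the two environments resolved different versions of the integration or of LanceDB itself.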
