Updated 5 months ago

LanceDB Metafilter Not Working

At a glance

The community member is trying to add metadata filtering with LanceDB. Filtering works when they use the LanceDB package directly, but queries that use MetadataFilters from LlamaIndex with LanceDB always return no results. They have provided example code and are asking what might be causing the issue. The comments suggest that a change made by a LanceDB developer may have broken the integration, which is not used very often. The community member considers switching to Postgres, but then gets the code working in a new virtual environment, which suggests the problem was specific to their original environment.

I'm trying to add metadata filtering using LanceDB. I have it working fine using purely their package, as outlined here and here.

However, when I use MetadataFilters from LlamaIndex with LanceDB, I always get no results. Thoughts? Could it have something to do with this section of code?

Example query below. I also tried the key metadata.theme.

filters = MetadataFilters(
    filters=[
        MetadataFilter(
            key="theme", operator=FilterOperator.EQ, value="Fiction"
        ),
    ]
)
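For anyone debugging the same symptom, it can help to picture how an equality filter might end up as a SQL-style where clause, and why the key name (theme versus metadata.theme) matters. The sketch below is purely illustrative and is not the LlamaIndex or LanceDB implementation: `SimpleFilter`, `to_where_clause`, the `metadata_keys` argument, and the `metadata.` prefix rule are all hypothetical names and assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class SimpleFilter:
    """Hypothetical stand-in for an equality metadata filter (not a LlamaIndex class)."""
    key: str
    value: object


def to_where_clause(filters, metadata_keys, prefix="metadata."):
    """Build a SQL-style where clause from equality filters.

    Assumption for illustration: metadata is stored in a nested column, so a bare
    key that is a known metadata field needs the prefix to match anything.
    """
    clauses = []
    for f in filters:
        # Prefix only bare keys that are known metadata fields;
        # keys that are already qualified are passed through unchanged.
        key = f"{prefix}{f.key}" if f.key in metadata_keys else f.key
        if isinstance(f.value, str):
            escaped = f.value.replace("'", "''")  # escape single quotes
            clauses.append(f"{key} = '{escaped}'")
        else:
            clauses.append(f"{key} = {f.value}")
    return " AND ".join(clauses)


# A bare key and an already-qualified key produce the same clause here.
print(to_where_clause([SimpleFilter("theme", "Fiction")], metadata_keys={"theme"}))
print(to_where_clause([SimpleFilter("metadata.theme", "Fiction")], metadata_keys={"theme"}))
# If the key is not recognised as metadata, the clause targets a top-level column instead.
print(to_where_clause([SimpleFilter("theme", "Fiction")], metadata_keys=set()))
```

If the clause that actually reaches the table points at a column that does not exist, or at the wrong nesting level, an empty result set is the kind of outcome one might expect, so comparing the generated clause with one that works against LanceDB directly is a reasonable first check.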
8 comments
@Logan M anywhere else I can look to troubleshoot?
I wonder if it has to do with this?
A lancedb dev changed this at one point
he may have broken it lol
(this integration isn't used terribly often)
I'll play around with it some more and see if I can fix. Or maybe I should just stop trying new things and use good ol' postgres 😆
Hm...let me try again. Just testing in a new venv and it works, so maybe I'm just going crazy 🙂

from llama_index.core import SimpleDirectoryReader, StorageContext, Settings, VectorStoreIndex
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.vector_stores.lancedb import LanceDBVectorStore
import lancedb
from llama_index.core.vector_stores import (
    MetadataFilter,
    MetadataFilters,
    FilterOperator,
)

Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
documents = SimpleDirectoryReader("./data", filename_as_id=True).load_data()
vector_store = LanceDBVectorStore(uri="./lancedb")
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context
)

# print([x.metadata for x in documents])

filters = MetadataFilters(
    filters=[
        MetadataFilter(key="metadata.file_name", operator=FilterOperator.EQ, value="essay.txt"),
    ]
)

# LlamaIndex test
retriever = index.as_retriever(filters=filters)
retriever.retrieve("discovered is to talk about space aliens") # This now works!?!
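Since the same code started working in a fresh venv, one plausible (but unconfirmed) explanation is that the two environments had different package versions installed. A minimal sketch for comparing them is shown here; it uses only the standard library, and the list of distribution names is an assumption about which packages are relevant. Run it in both environments and diff the output.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed to be the relevant distributions; adjust to match your setup.
PACKAGES = [
    "llama-index-core",
    "llama-index-vector-stores-lancedb",
    "lancedb",
]


def installed_versions(packages):
    """Return {distribution name: version string, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(PACKAGES).items():
    print(f"{name}: {ver or 'not installed'}")
```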