@Logan M I'm trying to use filtering

I'm trying to use

filtering_list = MetadataFilters(
    filters=[
        MetadataFilter(key='file_name', operator=FilterOperator.EQ, value=metadata),
        MetadataFilter(key='file_name', operator=FilterOperator.EQ, value=metadata),
    ],
    condition="or",
)

Here "value=metadata" is just the metadata name I gave to a group of documents in the database, so the filter can select them later. The issue is that when my query covers two different topics, retrieval stops working and I get the wrong response. Is there a way to refine the filter so that it is guaranteed to pull documents for both metadata values?

2 comments

Logan M

I'm not sure I know what you mean?

You have two filters with an OR -- assuming the vector db supports that, it should work?
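For reference, here is a minimal pure-Python sketch of what an OR of two equality filters on file_name is expected to select. It does not call LlamaIndex or any vector store: `matches_any`, the `(key, value)` tuples, and the sample file names are hypothetical stand-ins used only to check the logic.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical stand-in for an equality MetadataFilter: (key, expected value).
EqFilter = Tuple[str, Any]


def matches_any(metadata: Dict[str, Any], filters: List[EqFilter]) -> bool:
    """Return True if at least one equality filter matches (OR semantics)."""
    return any(metadata.get(key) == value for key, value in filters)


# Made-up node metadata, one dict per stored node.
nodes = [
    {"file_name": "topic_a.pdf"},
    {"file_name": "topic_b.pdf"},
    {"file_name": "topic_c.pdf"},
]

# Two filters on the same key with two different values.
filters = [("file_name", "topic_a.pdf"), ("file_name", "topic_b.pdf")]
selected = [n for n in nodes if matches_any(n, filters)]

# If both filters carry the same value, the OR collapses to one file.
same_value = [("file_name", "topic_a.pdf"), ("file_name", "topic_a.pdf")]
collapsed = [n for n in nodes if matches_any(n, same_value)]
```

One thing this makes visible: the OR only widens the result set when the two filters hold different values, so it is worth confirming that the two `value` arguments in the real filter are not the same variable.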

BC

@Logan M Is there a way to check which files are pulled by the filter, or is it only pulling text nodes? I'm implementing this feature in CRAG, so the filter is applied here, in llama-index/packs/corrective_rag/base.py:

def retrieve_nodes(self, query_str: str, **kwargs: Any) -> List[NodeWithScore]:
        """Retrieve the relevant nodes for the query."""
        retriever = self.index.as_retriever(filtering_list=self.filtering_list, **kwargs)
        return retriever.retrieve(query_str)

Then, so that as_retriever() can use the filters, I added a filtering_list parameter and passed it through to the filters parameter of the VectorIndexRetriever class:

def as_retriever(self, filtering_list: Any, **kwargs: Any) -> BaseRetriever:
        # NOTE: lazy import
        from llama_index.core.indices.vector_store.retrievers import (
            VectorIndexRetriever,
        )

        return VectorIndexRetriever(
            self,
            filters=filtering_list,
            node_ids=list(self.index_struct.nodes_dict.values()),
            callback_manager=self._callback_manager,
            object_map=self._object_map,
            **kwargs,
        )

The system works, but I don't know whether that's because my question contains two different metadata IDs that are being passed into the filter.
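One way to check which files the filter actually pulled is to count the file_name values on the retrieved nodes. The sketch below assumes each result exposes `.node.metadata` as a dict, the way NodeWithScore does. `count_source_files` is a hypothetical helper, and the `SimpleNamespace` objects are made-up stand-ins so the snippet runs without an index.

```python
from collections import Counter
from types import SimpleNamespace
from typing import Any, Iterable


def count_source_files(results: Iterable[Any], key: str = "file_name") -> Counter:
    """Count how many retrieved nodes came from each source file."""
    return Counter(r.node.metadata.get(key, "<missing>") for r in results)


def fake_result(file_name: str, score: float) -> SimpleNamespace:
    """Build a stand-in shaped like a retrieved result (node + score)."""
    node = SimpleNamespace(metadata={"file_name": file_name})
    return SimpleNamespace(node=node, score=score)


# Made-up results; in practice this would be the list from retrieve_nodes().
results = [
    fake_result("topic_a.pdf", 0.91),
    fake_result("topic_b.pdf", 0.84),
    fake_result("topic_a.pdf", 0.80),
]

counts = count_source_files(results)
for name, n in counts.items():
    print(f"{name}: {n} node(s)")
```

If only one file name shows up for a question that spans two topics, the filter (or the top-k cutoff) is probably excluding the second group rather than the LLM ignoring it.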
