
Updated 8 months ago

I created a retriever with some filters : filter_retriever.filtersMetadataFilters(filte

@kapa.ai I created a retriever with some filters :

filter_retriever.filters
MetadataFilters(filters=[MetadataFilter(key='file_root', value='Code militaire', operator=<FilterOperator.EQ: '=='>)]

I then try to retrieve from my query :
filter_retriever.retrieve(query)

However, the results contain nodes with the metadata 'file_root': 'Code de justice administrative', which does not satisfy the filter.

Why is my retriever not filtering properly on the metadata?
Please stand by @tatanfort as I search the knowledge base. I normally take about 20 seconds or so. :wave:
----------------------
Based on the information provided in the knowledge sources, it's hard to pinpoint the exact reason why your retriever is not filtering properly on the metadata. However, there are a few things you could check:

  1. FilterOperator: Ensure that the FilterOperator is set correctly. In your case it is FilterOperator.EQ, which should return only exact matches, so a node whose 'file_root' is 'Code de justice administrative' should not be returned for the value 'Code militaire'.
  2. Metadata Key-Value Pair: Make sure that the key-value pair in your filter matches the metadata in your data exactly. In your case, the key is 'file_root' and the value is 'Code militaire'. Any discrepancy in the key or value (case, whitespace, accents) can make the filter behave unexpectedly.
  3. Data Consistency: Check that the metadata was stored consistently at ingestion time. Missing or malformed values on some nodes can affect filtering.
  4. Filter Application: Ensure that the filters are applied when the retriever is created. The filters should be passed as an argument to the as_retriever method of the index.
Here's an example of how to create a retriever with filters:

Plain Text
from llama_index.core.vector_stores import MetadataFilter, MetadataFilters, FilterOperator

filters = MetadataFilters(
    filters=[
        MetadataFilter(key="file_root", operator=FilterOperator.EQ, value="Code militaire"),
    ]
)

retriever = index.as_retriever(filters=filters)


If you've checked all these points and the issue still persists, it might be a good idea to look into the documentation of LlamaIndex or reach out to the community for help.
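
To make the key-value check above concrete, here is a minimal sketch in plain Python that lists which returned metadata dicts fail an exact-equality filter and flags values that differ only by case, surrounding whitespace, or Unicode normalization. The helper names (`normalize`, `find_filter_violations`) and the sample data are hypothetical and not part of LlamaIndex; it only inspects dictionaries after retrieval and says nothing about why the store returned them.

```python
import unicodedata


def normalize(value):
    """Loosely normalize a value: NFC form, trimmed, case-folded."""
    return unicodedata.normalize("NFC", str(value)).strip().casefold()


def find_filter_violations(metadatas, expected):
    """Return (index, key, actual, near_miss) for each exact-match failure.

    near_miss is True when the stored value equals the expected one only
    after loose normalization (case, whitespace, Unicode form).
    """
    violations = []
    for i, metadata in enumerate(metadatas):
        for key, wanted in expected.items():
            actual = metadata.get(key)
            if actual != wanted:
                near_miss = actual is not None and normalize(actual) == normalize(wanted)
                violations.append((i, key, actual, near_miss))
    return violations


# Hypothetical metadata, shaped like the values reported in this thread.
retrieved = [
    {"file_root": "Code militaire"},
    {"file_root": "Code de justice administrative"},
    {"file_root": "code militaire "},
]

for index, key, actual, near_miss in find_filter_violations(
    retrieved, {"file_root": "Code militaire"}
):
    print(index, key, repr(actual), "near miss" if near_miss else "different value")
```

With real results, the list of dicts could be built from the retriever output, for example `[n.node.metadata for n in filter_retriever.retrieve(query)]`, assuming each result exposes its metadata that way. Entries reported as "different value" suggest the filter is not being applied at all, while "near miss" entries point to a mismatch between the stored value and the filter value.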
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
@Logan M hey! do you have any idea why the filters of the retriever are not applied even though when I display the retriever filters they actually appear properly ?

Here is my code :

synthetizer_output = preprocess_query(last_user_query=query, client_llm=client_llm, instruction=instruction, last_service_response='')
query_synthetized = synthetizer_output['query_processed']
filters = synthetizer_output['filters']
filters_retriever = [MetadataFilter(key=f, operator=FilterOperator.EQ, value=filters[f]) for f in filters]
if filters:
    filters = MetadataFilters(
        filters=filters_retriever
    )
filter_retriever = self.query_engine
# Set the filters in the retriever
filter_retriever.filters = filters
return filter_retriever.retrieve(query_synthetized)
Plain Text
filter_retriever = self.query_engine
# Set the filters in the retriever
filter_retriever.filters = filters


These two lines don't make sense. A query engine is not a retriever. A query engine also doesn't have a .filters attribute

Create a retriever and pass in the filters

index.as_retriever(filters=filters, ...)
@Logan M I should have copied the line before defining self.query_engine too. It is actually created as you say :

self.vect_from_vect_stor = VectorStoreIndex.from_vector_store(
    service_context=service_context, vector_store=vector_store
)
self.query_engine = self.vect_from_vect_stor.as_retriever(
    similarity_top_k=6,
    node_postprocessors=[self.postproc, self.rerank],
)

But the filters are not applied :/
@Logan M the retriever is well defined from what I see though :

print(filter_retriever)
<llama_index.core.indices.vector_store.retrievers.retriever.VectorIndexRetriever object at 0x2832cd2b0>

print(filter_retriever._filters)
filters=[MetadataFilter(key='file_root', value='Code militaire', operator=<FilterOperator.EQ: '=='>)
What vector store are you using?
Default? Pinecone? Weaviate?
idk what to tell yea, works fine for me on qdrant
[Attachment: image.png]