
Updated 3 months ago

I'm trying to use MetadataFilters to

I'm trying to use MetadataFilters to retrieve documents from an index whose date falls within a given interval:

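# Import path assumes llama_index >= 0.10; older releases expose these names elsewhere.
from llama_index.core.vector_stores import (
    FilterCondition,
    FilterOperator,
    MetadataFilter,
    MetadataFilters,
)

# The metadata stores dates as "%Y-%m-%d" strings, so format the bounds the same way.
start_date = start_date.strftime("%Y-%m-%d")
end_date = end_date.strftime("%Y-%m-%d")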

filters = MetadataFilters(
    filters=[
        MetadataFilter(key="last_modified_date", operator=FilterOperator.GTE, value=start_date),
        MetadataFilter(key="last_modified_date", operator=FilterOperator.LTE, value=end_date),
    ],
    condition=FilterCondition.AND,
)

retriever = index.as_retriever(similarity_top_k=k, metadata_filters=filters)
documents = retriever.retrieve(query)


However, this is not working properly: the results include documents with dates outside the interval.

Why is this happening, and how can I fix it?
14 comments
gt / lt really only work for numbers, not dates
Most vector stores don't seem to have a date filter tbh, but I haven't dug around too much
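Since numeric comparison is the case that is said to work, one possible workaround is to store the date as a number next to the string and filter on that field instead. A minimal sketch, assuming dates are `%Y-%m-%d` strings and using a hypothetical extra metadata key `last_modified_int` (not part of the original setup); whether a given vector store honours GTE/LTE on that field still has to be checked:

```python
from datetime import datetime


def date_to_int(value: str) -> int:
    """Encode a "%Y-%m-%d" string as an integer such as 20240315."""
    parsed = datetime.strptime(value, "%Y-%m-%d").date()
    return parsed.year * 10_000 + parsed.month * 100 + parsed.day


def add_numeric_date(metadata: dict,
                     source_key: str = "last_modified_date",
                     target_key: str = "last_modified_int") -> dict:
    """Return a copy of the metadata with a numeric date field added.

    `target_key` is a hypothetical name chosen for this example.
    """
    enriched = dict(metadata)
    enriched[target_key] = date_to_int(metadata[source_key])
    return enriched


# Enrich the metadata before indexing, then compare plain integers.
metadata = add_numeric_date({"last_modified_date": "2024-03-15"})
start, end = date_to_int("2024-03-01"), date_to_int("2024-03-31")
in_range = start <= metadata["last_modified_int"] <= end
```

The filter bounds would then be passed as the same kind of integers (`date_to_int(...)`) rather than as date strings.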
Is there a way to accomplish this or something similar?
I thought about making a copy of the index and deleting the documents that do not match, then retrieving against the copy, but it's too slow
maybe retrieve a large top k, order by date, and return the top from the date range?
using a custom node postprocessor
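The over-fetch-then-filter idea can be sketched roughly as follows. This is only an illustration in plain Python: `ScoredDoc` is a hypothetical stand-in for a retrieved node, and the sketch keeps the in-range documents ordered by similarity score rather than by date, which is one of several reasonable choices:

```python
from dataclasses import dataclass, field
from datetime import date, datetime


@dataclass
class ScoredDoc:
    """Hypothetical stand-in for a retrieved node with a score and metadata."""
    text: str
    score: float
    metadata: dict = field(default_factory=dict)


def filter_by_date_range(docs, start: date, end: date, top_k: int,
                         key: str = "last_modified_date"):
    """Keep docs whose date lies in [start, end], best scores first."""
    kept = []
    for doc in docs:
        raw = doc.metadata.get(key)
        if raw is None:
            continue  # skip documents that carry no date at all
        doc_date = datetime.strptime(raw, "%Y-%m-%d").date()
        if start <= doc_date <= end:
            kept.append(doc)
    kept.sort(key=lambda d: d.score, reverse=True)
    return kept[:top_k]


# Pretend these came back from a retriever with a deliberately large top k.
candidates = [
    ScoredDoc("a", 0.91, {"last_modified_date": "2023-12-30"}),
    ScoredDoc("b", 0.85, {"last_modified_date": "2024-01-10"}),
    ScoredDoc("c", 0.80, {"last_modified_date": "2024-01-20"}),
    ScoredDoc("d", 0.70, {}),
]
result = filter_by_date_range(candidates, date(2024, 1, 1), date(2024, 1, 31), top_k=2)
print([doc.text for doc in result])  # ['b', 'c']
```

The same logic could live inside a custom node postprocessor; the catch is that it can only return documents that made it into the over-fetched candidate set in the first place.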
Oh I already have a function to do that
the problem is that it doesn't really work well for time intervals in the past that have already ended
Nor in situations where belonging to a date interval is more important than similarity
But I'm comparing them as strings though
"%Y-%m-%d" strings can be compared directly, as string comparison yields the same result as date comparison for this format. It's also how they're stored in the metadata
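The claim about string comparison can be checked in plain Python. A small sketch, assuming four-digit years and zero-padded months and days (which is what `strftime("%Y-%m-%d")` produces for such dates); it says nothing about whether a particular vector store compares strings this way:

```python
import random
from datetime import date, timedelta

random.seed(0)  # fixed seed so the sample is reproducible
base = date(2000, 1, 1)
dates = [base + timedelta(days=random.randrange(0, 20_000)) for _ in range(1_000)]
as_strings = [d.strftime("%Y-%m-%d") for d in dates]

# Sort once as real dates and once as plain text, then compare the two orders.
sorted_by_date = [d.strftime("%Y-%m-%d") for d in sorted(dates)]
sorted_as_text = sorted(as_strings)
orders_match = sorted_by_date == sorted_as_text

# Without zero padding the property breaks: March 5 sorts after October 1.
unpadded_before = "2024-3-5" < "2024-10-01"
```

Here `orders_match` is `True` and `unpadded_before` is `False`, so the comparison is only safe as long as every stored value uses the padded format.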
@Logan M
The crazy thing is that this:
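# Import path assumes llama_index >= 0.10.
from llama_index.core.retrievers import VectorIndexRetriever

# Restrict retrieval to the documents whose ids are in considered_ids.
retriever = VectorIndexRetriever(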
    index,
    similarity_top_k=k,
    doc_ids=considered_ids
)

documents = retriever.retrieve(query)

is not working either. It's returning documents with IDs outside of doc_ids.

I already filed a Bug Report.
It might depend on the vector store you are using tbh