
Updated last year

Filters

Is there any workaround right now for using more complex filter queries with backends that support them? I'm using OpenSearch and I have complex data where I want to mix queries that filter by date range, gt/lt values, etc., but it appears that the match filters are locked to ExactMatch. Are there any other approaches (lower-level ones are fine) that would help me do this?
class MetadataFilters(BaseModel):
    """Metadata filters for vector stores.

    Currently only supports exact match filters.
    TODO: support more advanced expressions.
    """

    filters: List[ExactMatchFilter]

    @classmethod
    def from_dict(cls, filter_dict: Dict) -> "MetadataFilters":
        """Create MetadataFilters from json."""
        filters = []
        for k, v in filter_dict.items():
            filter = ExactMatchFilter(key=k, value=v)
            filters.append(filter)
        return cls(filters=filters)
18 comments
It's definitely in the backlog to update filter abstractions

You could make a PR to enable passing in your own open-search specific filters though
@Logan M I'd be more than happy to do so; I guess I need to find the right place to apply this. I know how to formulate the search filters directly to opensearch/elastic; these are just generic json objects. The best approach IMO would be a mechanism to just pass these right through as it would leave all the existing flexibility
It looks like the examples are just passing the search query through already:

ExactMatchFilter(key="term", value='{"metadata.is_footnote": "true"}'),
I wonder if it would just work if I passed something like:

"query": {
    "range": {
        "metadata.general__rating": {
            "gt": 3
        }
    }
}

ExactMatchFilter(key="range", value='{"metadata.general__rating": {"gt":3}}')
Cool, this looks pretty workable, and it does look like it is just reconstructing the basic filter.
I'll try passing through the alternative. Maybe 'ExactMatch' is a bit of a misnomer.
In this case it looks like it's just being abused to support OpenSearch lol
Yes, non-exact searches work fine: range and fuzzy queries all return correctly. Thanks for the pointers to the source; I should have dug a little further.
To be clear, this worked just fine:
    meta_filter = MetadataFilters(
        filters=[
            ExactMatchFilter(
                key="range",
                value='{"metadata.general__rating": {"gt":5}}'
            )
        ]
    )
 
I assume it will work for the other non-exact search filters as well: datetime ranges, fuzzy matching, and so on. Anyway, very cool!
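For anyone reading later, here is a minimal sketch of why this pass-through works. It assumes, based on the behaviour described above, that the OpenSearch integration turns each filter into a clause of the form {key: parsed JSON value}. The `ExactMatchFilter` below is a stand-in dataclass and `to_opensearch_clauses` is a hypothetical helper name, not the library's actual code.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class ExactMatchFilter:
    """Stand-in for the library's filter type: just a key and a JSON string value."""
    key: str
    value: str


def to_opensearch_clauses(filters: List[ExactMatchFilter]) -> List[Dict[str, Any]]:
    """Assumed behaviour: each filter becomes one clause, {key: parsed JSON value}."""
    return [{f.key: json.loads(f.value)} for f in filters]


clauses = to_opensearch_clauses(
    [ExactMatchFilter(key="range", value='{"metadata.general__rating": {"gt": 5}}')]
)
print(clauses)
```

Under that assumption, the key is really the OpenSearch query type ("term", "range", "fuzzy") and the value is its JSON body, which is why the value has to be a complete, valid JSON object string.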
Did you have any success with date ranges? Just yesterday I started looking into it and was considering working on the PR @chsurf
@Dima Date ranges work fine for OpenSearch, everything is just pass-through. so something like:
{
  "_source": ["content", "metadata.general__rating"],
  "query": {
    "range": {
      "metadata.general__rating": {
        "gt": "2022-11-20T16:45:00"
      }
    }
  }
}


would be:

    meta_filter = MetadataFilters(
        filters=[
            ExactMatchFilter(
                key="range",
                value='{"metadata.general__rating": {"gt":"2022-11-20T16:45:00"}}'
            )
        ]
    )
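Hand-writing the JSON string is easy to get wrong (a missing brace or quote makes the value unparseable), so one option is to build it with `json.dumps`. The sketch below is only an illustration: `range_filter_value` is a hypothetical helper, not part of any library, and it assumes the field accepts ISO 8601 strings as in the example above.

```python
import json
from datetime import datetime


def range_filter_value(field: str, **bounds) -> str:
    """Serialize a range clause body; datetimes are converted to ISO 8601 strings."""
    body = {
        op: (v.isoformat() if isinstance(v, datetime) else v)
        for op, v in bounds.items()
    }
    return json.dumps({field: body})


# Hypothetical usage: pass the result as the `value` of the filter with key="range".
value = range_filter_value(
    "metadata.general__rating", gt=datetime(2022, 11, 20, 16, 45)
)
print(value)
```

The resulting string parses back to the same range clause as the hand-written one, and numeric bounds such as `gt=5` go through unchanged.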
 


However, note that I don't see any way at present to restrict which fields come back: you get everything, since there is nowhere to put the '_source' component. That would require an alternative to ExactMatchFilter; I'd just make a 'PassThroughMatchFilter' instead.
And of course this whole approach is specific to OpenSearch; the dedicated vector DBs don't support these kinds of requests, and the syntax would be completely different for PostgreSQL.
I think it is worth having an alternative, though; both OpenSearch and PostgreSQL provide much richer, well-supported hybrid interfaces for this kind of work.
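A rough sketch of what the 'PassThroughMatchFilter' idea could look like, purely as an illustration. Nothing here exists in the library: the class, `build_search_body`, and the `source_fields` parameter are all hypothetical, and the way the real integration combines filters with its vector query may well differ. It only shows raw clauses going into a standard OpenSearch `bool`/`filter` query, with an optional `_source` list.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class PassThroughMatchFilter:
    """Hypothetical filter that carries a raw OpenSearch query clause as a dict."""
    clause: Dict[str, Any]


def build_search_body(
    filters: List[PassThroughMatchFilter],
    source_fields: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Combine raw clauses into a bool/filter query, optionally limiting _source."""
    body: Dict[str, Any] = {
        "query": {"bool": {"filter": [f.clause for f in filters]}}
    }
    if source_fields is not None:
        body["_source"] = source_fields
    return body


body = build_search_body(
    [PassThroughMatchFilter({"range": {"metadata.general__rating": {"gt": 3}}})],
    source_fields=["content", "metadata.general__rating"],
)
print(body)
```

Keeping the clause as a dict rather than a JSON string avoids the parsing step, and a separate `source_fields` argument gives the '_source' list somewhere to live.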