hello, is there an example, where the

At a glance

The community member asks for an example of using an LLM to generate metadata filters, rather than passing the filter values by hand as in the example they found. The replies point to the "auto retriever" and link to the relevant documentation. The community member then hits a max_tokens error (8192 requested, while the model supports at most 4096 completion tokens) and asks whether long text content can be split into nodes. The replies suggest chunking the data before indexing, keeping a link to the full article in the metadata, or chunking after retrieval. The community member also considers using BaseToolSpec to define an API that first filters articles by keyword.

SMN

hello, is there an example where the LLM can be integrated to generate the filtering on metadata values? The only example I see is where the user passes the values to the filter, e.g.:

Plain Text

from llama_index.vector_stores.types import ExactMatchFilter, MetadataFilters

filters = MetadataFilters(
    filters=[ExactMatchFilter(key="type", value="fruit")]
)

15 comments

Logan M

What you describe is an auto retriever

Logan M

Two ways of doing it
https://docs.llamaindex.ai/en/stable/examples/agent/openai_agent_query_cookbook.html#autoretrieval-from-a-vector-database

https://docs.llamaindex.ai/en/stable/examples/vector_stores/elasticsearch_auto_retriever.html#define-vectorindexautoretriever
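
For readers who want the gist before opening those notebooks: an auto retriever asks the LLM to turn the user's question into a structured query (a search string plus metadata filters) and then runs that query against the vector store. The sketch below is plain, library-agnostic Python and not the LlamaIndex API. The JSON shape and the names `parse_llm_filters` and `apply_filters` are hypothetical, and the LLM response is hard-coded rather than generated.

```python
import json


def parse_llm_filters(llm_output: str) -> dict:
    """Parse a JSON object such as {"query": "...", "filters": {"type": "fruit"}}."""
    spec = json.loads(llm_output)
    filters = spec.get("filters", {})
    if not isinstance(filters, dict):
        raise ValueError("filters must be a JSON object of key/value pairs")
    return {"query": spec.get("query", ""), "filters": filters}


def apply_filters(records: list[dict], filters: dict) -> list[dict]:
    """Keep records whose metadata matches every key/value pair exactly."""
    return [
        r for r in records
        if all(r["metadata"].get(k) == v for k, v in filters.items())
    ]


# Hard-coded stand-in for what an LLM might return (assumption, not real output).
llm_output = '{"query": "sweet", "filters": {"type": "fruit"}}'

records = [
    {"text": "Apples are sweet.", "metadata": {"type": "fruit"}},
    {"text": "Carrots are crunchy.", "metadata": {"type": "vegetable"}},
]

spec = parse_llm_filters(llm_output)
matches = apply_filters(records, spec["filters"])
```

In the linked examples the library handles both steps; the point here is only that the filter values come from the model's output instead of being typed in by the user.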

SMN

thanks! I'll give it a try

SMN

@Logan M is there also a way to support node splitting in case the text is long?

Logan M

In case which text is long?

SMN

the content of a TextNode that I'm indexing

SMN

(I'm using a single TextNode for a single blog post)

SMN

also, when using the filtering technique illustrated in your first link, I have some issues:

Plain Text

Got output: Error: Error code: 400 - {'error': {'message': 'max_tokens is too large: 8192. This model supports at most 4096 completion tokens, whereas you provided 8192.', 'type': 'invalid_request_error', 'param': 'max_tokens', 'code': None}}

Logan M

Should probably be chunking your data before indexing it?

SMN

but then will it be possible to get the original article back?

Logan M

If you need the full article, you can attach some link to it in the metadata (i.e. a file path, or a URL)

Logan M

Or you can chunk the data after retrieving
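
A minimal sketch of the first suggestion (chunk before indexing, and keep a pointer to the full article in each chunk's metadata), written as plain Python so it runs without any library. LlamaIndex's own node parsers would normally do the splitting. The fixed character-based chunk size, the `source_url` and `chunk_index` keys, and the example URL are all assumptions made for illustration.

```python
def chunk_article(text: str, url: str, chunk_size: int = 200) -> list[dict]:
    """Split text into fixed-size character chunks that each remember their source."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        {
            "text": text[start:start + chunk_size],
            "metadata": {"source_url": url, "chunk_index": i},
        }
        for i, start in enumerate(range(0, len(text), chunk_size))
    ]


def source_urls(retrieved: list[dict]) -> list[str]:
    """Unique article URLs for the retrieved chunks, in first-seen order."""
    return list(dict.fromkeys(c["metadata"]["source_url"] for c in retrieved))


article = "word " * 100  # 500 characters of placeholder text
chunks = chunk_article(article, "https://example.com/post-1")

# Pretend the first two chunks were retrieved; both point back to one article.
urls = source_urls(chunks[:2])
```

Because every chunk carries the same `source_url`, the full post can be fetched again after retrieval even though only the chunks were indexed.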

SMN

I'm looking into how to use BaseToolSpec, this is probably the best solution

SMN

by defining, under the hood, an API that first filters the articles

SMN

based on some keywords
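
The keyword pre-filter described here could look roughly like the sketch below. It is plain Python only: wrapping the function as a tool with BaseToolSpec is not shown, and the function name `filter_articles`, the `title`/`body` fields, and the sample data are hypothetical.

```python
def filter_articles(articles: list[dict], keywords: list[str]) -> list[dict]:
    """Return articles whose title or body contains any keyword (case-insensitive)."""
    wanted = [k.strip().lower() for k in keywords if k.strip()]
    if not wanted:
        return []
    hits = []
    for article in articles:
        haystack = f"{article['title']} {article['body']}".lower()
        if any(k in haystack for k in wanted):
            hits.append(article)
    return hits


articles = [
    {"title": "Intro to vector stores", "body": "Embeddings and similarity search."},
    {"title": "Baking bread", "body": "Flour, water, salt, yeast."},
    {"title": "Metadata filters", "body": "Narrowing a vector search by type."},
]

selected = filter_articles(articles, ["Vector"])
```

Only the articles that survive this step would then be passed to the retriever or the LLM, which keeps the amount of text per request smaller.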
