Updated 2 months ago

Filtering chapter-wise

Plain Text
index = VectorStoreIndex.from_vector_store(vector_store=vector_store, service_context=service_context, storage_context=storage_context)
filters = MetadataFilters(filters=[ExactMatchFilter(key="name", value="Council Liquor License Review Committee")])
retriever = index.as_retriever(filters=filters, top_k_similarity=1000000)
all_nodes = retriever.retrieve()
summary_index = SummaryIndex.build_index_from_nodes(all_nodes)


What's the best way to accomplish something similar to this?
6 comments
I was wondering whether, for each chapter, you could create a unique piece of text and add it to the respective chapter.

Each chapter should have one. That way, when you filter, you can put that text into the filter value and it fetches exactly that chapter's nodes and not nodes from other chapters.

I'm not sure whether it'll work or not.
😅
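The tagging idea above can be sketched without any particular library. This is only an illustration: `Node`, `tag_chapter`, `exact_match`, and the `"chapter"` metadata key are hypothetical names, not LlamaIndex APIs. It shows every node of a chapter carrying the same unique value, so that an exact-match filter on that value returns that chapter and nothing else.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for an indexed text chunk."""
    text: str
    metadata: dict = field(default_factory=dict)


def tag_chapter(nodes, chapter_id):
    """Stamp every node of one chapter with the same unique tag."""
    for node in nodes:
        node.metadata["chapter"] = chapter_id
    return nodes


def exact_match(nodes, key, value):
    """Keep only nodes whose metadata[key] equals value."""
    return [n for n in nodes if n.metadata.get(key) == value]


chapter_1 = tag_chapter([Node("Intro"), Node("Background")], "chapter-1")
chapter_2 = tag_chapter([Node("Methods")], "chapter-2")

hits = exact_match(chapter_1 + chapter_2, "chapter", "chapter-1")
print([n.text for n in hits])
```

In a real index the same tag would be written into each node's metadata at ingestion time, and the filter value would be that tag.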
@bmax in which case do you get an empty retrieve()? Do you also get it for an existing metadata value?
You can't do an empty retrieve, but I thought it might be nice to be able to just get all nodes matching the filters instead of using embeddings. cc @Logan M?
Just passing the same text to .retrieve() as my metadata filter value actually works, but I'm not sure that's reliable for other people.
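To make the difference concrete, here is a library-agnostic sketch of the two behaviours being compared: a filter-only lookup that returns every matching node, and a filtered top-k lookup that ranks the matches by a similarity score and keeps only the best k. The function names, the dict-based node layout, and the `score` values are all made up for illustration; a real vector store would compute scores from embeddings.

```python
def matches(metadata, filters):
    """True when every filter key/value pair is present in the metadata."""
    return all(metadata.get(key) == value for key, value in filters.items())


def filter_only(nodes, filters):
    """Return every node that passes the filters; no query, no ranking."""
    return [n for n in nodes if matches(n["metadata"], filters)]


def filtered_top_k(nodes, filters, k):
    """Apply the filters, rank by score, and keep only the k best nodes."""
    kept = filter_only(nodes, filters)
    return sorted(kept, key=lambda n: n["score"], reverse=True)[:k]


# Hypothetical nodes; "score" stands in for a query-dependent similarity.
nodes = [
    {"text": "a", "score": 0.9, "metadata": {"name": "Committee"}},
    {"text": "b", "score": 0.2, "metadata": {"name": "Committee"}},
    {"text": "c", "score": 0.5, "metadata": {"name": "Committee"}},
    {"text": "d", "score": 0.8, "metadata": {"name": "Other"}},
]
filters = {"name": "Committee"}

everything = filter_only(nodes, filters)
best_two = filtered_top_k(nodes, filters, k=2)
print(len(everything), [n["text"] for n in best_two])
```

With a top-k retriever, the result only covers all matching nodes when k is at least as large as the number of matches, which is why a very large top-k value is used as a workaround.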
here's what I came up with:

Plain Text
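# Imports assume a pre-0.10 llama_index release (the ServiceContext era).
from llama_index import ListIndex, VectorStoreIndex
from llama_index.vector_stores.types import ExactMatchFilter, MetadataFilters
# vector_store, service_context and storage_context are created elsewhere.
index = VectorStoreIndex.from_vector_store(vector_store=vector_store, service_context=service_context, storage_context=storage_context)
filters = MetadataFilters(filters=[ExactMatchFilter(key="name", value="Council Liquor License Review Committee")])
# Ask for a large top-k so that every node passing the filter can come back.
retriever = index.as_retriever(filters=filters, similarity_top_k=1000)
# retrieve() needs a query string, so reuse the filter value as the query.
all_nodes = retriever.retrieve("Council Liquor License Review Committee")
# Unwrap the scored results into plain nodes.
nodes = [node.node for node in all_nodes]
summary_index = ListIndex(nodes=nodes, service_context=service_context)
response = summary_index.as_query_engine().query("Please summarize all concise details, what licenses were approved or denied, and include dates if possible")
print(response)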
Which I think? works?