
Updated 2 months ago

Question about retrievers and metadata

I'm trying to use metadata filters with a query engine to retrieve the correct nodes, because the results are often wrong. All of my source documents have a metadata key "source" that contains the document's URL. I tried the following code with a RetrieverQueryEngine, but the results don't appear to be filtered: I'm still getting back nodes from the wrong documents. Can someone let me know whether the code is implemented correctly?
filters = MetadataFilters(
    filters=[
        ExactMatchFilter(
            key="source",
            value="https://msrc.microsoft.com/update-guide/vulnerability/CVE-2023-4351",
        )
    ]
)
retriever = VectorIndexRetriever(
    index=vector_store_indicies['msrc_security_update'],
    similarity_top_k=5,
    metadata_filters=filters
)
query_engine = RetrieverQueryEngine(
    retriever=retriever,
    node_postprocessors=[metadata_replace]
)
response = query_engine.query(
    "fully explain with details 'CVE-2023-4351'",
)
8 comments
As a quick update: if I query the ChromaDB collection directly with the CVE string, I get back all the correct nodes:
data = collection.query(
    query_texts='Chromium: CVE-2023-4351 Use after free in Network',
    n_results=5,
    where_document={'$contains': 'CVE-2023-4351'},
    include=['metadatas', 'distances'],
)
So I'm wondering whether I'm misusing something on the LlamaIndex side. Chroma has the correct node data, but for some reason the LlamaIndex query_engine isn't returning the correct nodes.
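One way to confirm whether a filter is taking effect is to tally the "source" value of every node that comes back. The sketch below works on plain metadata dictionaries. The helper name count_sources and the sample data are hypothetical. In LlamaIndex the dictionaries would typically be collected from the returned nodes (for example from each entry in response.source_nodes), which is an assumption to check against the installed version.

```python
from collections import Counter
from typing import Any, Dict, Iterable


def count_sources(metadatas: Iterable[Dict[str, Any]], key: str = "source") -> Counter:
    """Tally how many retrieved nodes came from each source URL."""
    return Counter(meta.get(key, "<missing>") for meta in metadatas)


# Hypothetical metadata, standing in for what a retriever returned.
expected = "https://example.com/doc-a"
retrieved = [
    {"source": "https://example.com/doc-a"},
    {"source": "https://example.com/doc-b"},
    {"source": "https://example.com/doc-a"},
]

counts = count_sources(retrieved)

# Anything left in off_target means the filter was not applied.
off_target = {src: n for src, n in counts.items() if src != expected}
print(off_target)
```

If off_target is non-empty, the retriever returned nodes from other documents, which points at the filter not reaching the vector store rather than at the stored data.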
I think the correct keyword argument is filters=filters, not metadata_filters=filters.
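A likely reason the misnamed argument raised no error is that a constructor accepting **kwargs swallows unknown keywords silently. The toy class below is a hypothetical stand-in that illustrates this, not the LlamaIndex VectorIndexRetriever itself. The class name, node layout, and URLs are all made up for the example.

```python
from typing import Any, Dict, List, Optional


class ToyRetriever:
    """Hypothetical stand-in for a retriever; NOT the LlamaIndex class."""

    def __init__(
        self,
        nodes: List[Dict[str, Any]],
        similarity_top_k: int = 5,
        filters: Optional[Dict[str, str]] = None,
        **kwargs: Any,  # unknown keywords land here and are never read
    ) -> None:
        self._nodes = nodes
        self._top_k = similarity_top_k
        self._filters = filters or {}
        self.ignored_kwargs = kwargs

    def retrieve(self) -> List[Dict[str, Any]]:
        # Keep only nodes whose metadata matches every filter exactly.
        matches = [
            node
            for node in self._nodes
            if all(node["metadata"].get(k) == v for k, v in self._filters.items())
        ]
        return matches[: self._top_k]


nodes = [
    {"text": "CVE-2023-4351 details", "metadata": {"source": "https://example.com/a"}},
    {"text": "unrelated advisory", "metadata": {"source": "https://example.com/b"}},
]
wanted = {"source": "https://example.com/a"}

wrong = ToyRetriever(nodes, metadata_filters=wanted)  # misnamed: silently ignored
right = ToyRetriever(nodes, filters=wanted)           # recognized: filter applied

print(len(wrong.retrieve()), len(right.retrieve()))
print(list(wrong.ignored_kwargs))
```

In the toy version the misnamed call returns every node while the correctly named one returns only the matching node, which mirrors the symptom described in the question.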
@Logan M thanks so much, that seems to have worked. As an FYI, I got the code snippets from the Llama Index chatbot...it seems to give wrong code examples a lot. Although I'm a total newbie, I would be happy to volunteer to do some legwork for the chatbot if it'll help improve its responses.
lol we actually don't maintain that chatbot. The one on Discord is made by kapa.ai, the one on the docs is made by Mendable
lol...alrighty then