Find answers from the community

Updated 3 months ago

Kapa couldn't handle this one. Can I manually supply a set of 'nodes' retrieved from a retrieval engine for use in a filtered query engine query? I have something like this:

Plain Text
    nodes = retriever.retrieve(args.query)
    # now filter the nodes somehow, e.g. make sure we use only
    # the 'best' result from each unique document
    # Create a query engine that only searches certain footnotes.
    filtered_query_engine = indexes[args.index].as_query_engine(
        filters=meta_filter
    )
    res = filtered_query_engine.query(args.query)
    print(res.response)


I'd like to directly supply a filtered set of nodes so that I can control which nodes are ultimately used.
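For reference, the "filter the nodes somehow" step from the snippet above could be sketched in plain Python. This assumes each node's metadata carries a `doc_id` key identifying its source document (an assumption; adjust to whatever key your ingestion pipeline sets). The `Node`/`NodeWithScore` classes below are illustrative stand-ins for the shapes LlamaIndex's retriever returns, not the real classes:

```python
from dataclasses import dataclass, field

# Stand-ins for the objects retriever.retrieve(query) returns in
# LlamaIndex; illustrative only, not the real library classes.
@dataclass
class Node:
    metadata: dict = field(default_factory=dict)

@dataclass
class NodeWithScore:
    node: Node
    score: float

def best_node_per_document(nodes_with_scores):
    """Keep only the highest-scoring node for each unique document."""
    best = {}
    for nws in nodes_with_scores:
        doc_id = nws.node.metadata.get("doc_id")
        if doc_id not in best or nws.score > best[doc_id].score:
            best[doc_id] = nws
    # Return in a deterministic order: highest score first.
    return sorted(best.values(), key=lambda nws: nws.score, reverse=True)

# Example: three hits, two of them from the same document.
hits = [
    NodeWithScore(Node({"doc_id": "a"}), 0.9),
    NodeWithScore(Node({"doc_id": "a"}), 0.7),
    NodeWithScore(Node({"doc_id": "b"}), 0.8),
]
filtered = best_node_per_document(hits)
```

The same logic can then feed whichever response-synthesis path you choose, since it only depends on nodes exposing a `score` and a metadata dict.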
11 comments
Please stand by @chsurf as I search the knowledge base. I normally take about 20 seconds or so. :wave:
----------------------
Yes, you can manually supply a set of 'nodes' retrieved from a retrieval engine for use in a filtered query engine query. You can do this by first getting the full set of nodes from a query call, and then sending them to a node postprocessor. Finally, you can synthesize a response through a summary index. Here is an example of how you can do this:

Plain Text
from llama_index import SummaryIndex

query_str = "Your query string here"

# Create a query engine
query_engine = index.as_query_engine(similarity_top_k=3, response_mode="no_text")

# Get the initial response
init_response = query_engine.query(query_str)

# Get the nodes from the initial response
resp_nodes = [n.node for n in init_response.source_nodes]

# Create a summary index from the nodes
summary_index = SummaryIndex(resp_nodes)

# Create a new query engine with the node postprocessors
query_engine = summary_index.as_query_engine(node_postprocessors=[node_postprocessor])

# Get the final response
response = query_engine.query(query_str)


In this example, node_postprocessor is a placeholder for the node postprocessor you want to use. You can replace it with the actual node postprocessor you want to use.
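As a concrete illustration, a simple score-cutoff postprocessor might look like the sketch below. This is illustrative only: a real LlamaIndex postprocessor would subclass `BaseNodePostprocessor`, and the class and method names here are hypothetical stand-ins showing the shape of the idea.

```python
from dataclasses import dataclass

# Illustrative stand-in for a scored node returned by retrieval.
@dataclass
class ScoredNode:
    text: str
    score: float

class ScoreCutoffPostprocessor:
    """Drop retrieved nodes whose similarity score falls below a cutoff.

    Sketch only: a real LlamaIndex postprocessor subclasses
    BaseNodePostprocessor; these names are illustrative.
    """

    def __init__(self, cutoff: float = 0.75):
        self.cutoff = cutoff

    def postprocess_nodes(self, nodes):
        return [n for n in nodes if n.score >= self.cutoff]

pp = ScoreCutoffPostprocessor(cutoff=0.75)
kept = pp.postprocess_nodes([
    ScoredNode("good hit", 0.91),
    ScoredNode("weak hit", 0.42),
])
```

The cutoff value is arbitrary here; in practice you would tune it against your embedding model's score distribution.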

This code is based on the examples provided in the LlamaIndex documentation and GitHub repository.
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
This sounds like you want to implement a custom node-postprocessor, and use it in a query engine?
@Logan M maybe that is the right way? Basically I just want to add a bit more fine-grained control over which nodes are ultimately used to provide context to the LLM.
Either I need to be able to manually provide my filtered list of nodes and prevent the 'query' operation from running a redundant query against the vector DB, or I need to modify the query engine class (I guess?) to postprocess the set of nodes returned there during the retrieval stage.
If I just run res = filtered_query_engine.query(args.query) right now, it will always pull all the best-matching nodes/segments, which might include multiple segments from the same conversation.
I'm having trouble finding the source code for index.as_query_engine(), or the right version of query() whose signature might explain this...
Hmm, maybe I actually need a custom retriever.
Yea, it's either a custom retriever or a custom node-postprocessor I think πŸ‘€ Thankfully both are easy to create
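The custom-retriever route could be sketched as follows. This is a plain-Python sketch: a real LlamaIndex custom retriever would subclass `BaseRetriever` and implement `_retrieve`; here a plain class with a `retrieve()` method shows the shape, with a stub inner retriever standing in for the vector-index retriever.

```python
from dataclasses import dataclass, field

# Illustrative stand-ins for LlamaIndex's node objects.
@dataclass
class Node:
    metadata: dict = field(default_factory=dict)

@dataclass
class NodeWithScore:
    node: Node
    score: float

class DedupingRetriever:
    """Wrap an inner retriever and keep one node per source document.

    Sketch only: a real LlamaIndex custom retriever subclasses
    BaseRetriever and implements _retrieve(query_bundle).
    """

    def __init__(self, inner):
        self.inner = inner  # anything with a .retrieve(query) method

    def retrieve(self, query):
        best = {}
        for nws in self.inner.retrieve(query):
            key = nws.node.metadata.get("doc_id")
            if key not in best or nws.score > best[key].score:
                best[key] = nws
        return list(best.values())

# Stub standing in for a vector-index retriever; returns two hits
# from the same conversation plus one from another.
class StubRetriever:
    def retrieve(self, query):
        return [
            NodeWithScore(Node({"doc_id": "conv1"}), 0.9),
            NodeWithScore(Node({"doc_id": "conv1"}), 0.6),
            NodeWithScore(Node({"doc_id": "conv2"}), 0.8),
        ]

results = DedupingRetriever(StubRetriever()).retrieve("example query")
```

Wrapping the retriever rather than the query engine keeps the dedup logic in one place, so the same filtering applies no matter which query engine or response synthesizer sits on top.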
@Logan M yeap, they very much were