I do wonder if this should be supported. I think it could be, by updating a few lines of code so that `as_query_engine` forwards the template:
```python
# What I originally tried -- the summary_template kwarg didn't take effect here
query_engine = index.as_query_engine(
    response_mode="tree_summarize",
    service_context=service_context,
    summary_template=custom_chat_template,
)
```
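(For context, `custom_chat_template` above is just a chat-style prompt. A minimal sketch of how it might be built, assuming the standard `ChatPromptTemplate` from this era of LlamaIndex; the message contents are placeholders:)

```python
from llama_index.llms import ChatMessage, MessageRole
from llama_index.prompts import ChatPromptTemplate

# Placeholder messages; tree_summarize templates are filled with
# {context_str} and {query_str}
custom_chat_template = ChatPromptTemplate(
    message_templates=[
        ChatMessage(role=MessageRole.SYSTEM, content="You are a concise summarizer."),
        ChatMessage(
            role=MessageRole.USER,
            content="Context:\n{context_str}\n\nAnswer the question: {query_str}",
        ),
    ]
)
```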
I got around it by building the response synthesizer myself:
```python
from llama_index import get_response_synthesizer

# Build the synthesizer directly so the custom summary template is honored
response_synthesizer = get_response_synthesizer(
    response_mode="tree_summarize",
    summary_template=custom_chat_template,
    service_context=service_context,
)
query_engine = index.as_query_engine(response_synthesizer=response_synthesizer)
```
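With that in place, querying works as usual (the question text here is just an example):

```python
response = query_engine.query("Give me a high-level summary of the document.")
print(response)
```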
Now I can see my custom template being used when I inspect the recorded LLM calls:
```python
# event_pairs is a list of (start_event, end_event) pairs, one per LLM call;
# printing the start event of the third call shows the prompt that was sent
event_pairs = llama_debug.get_llm_inputs_outputs()
print(event_pairs[2][0])
```
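In case it helps anyone reproducing this, here is roughly how `llama_debug` was wired up; a sketch assuming the standard `LlamaDebugHandler` callback:

```python
from llama_index import ServiceContext
from llama_index.callbacks import CallbackManager, LlamaDebugHandler

# Register the debug handler on the service context so LLM calls are recorded
llama_debug = LlamaDebugHandler(print_trace_on_end=False)
service_context = ServiceContext.from_defaults(
    callback_manager=CallbackManager([llama_debug])
)
```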