Hi I am trying to use Recursive

Dokmy

Hi, I am trying to use Recursive Retriever + Document Agent. (https://gpt-index.readthedocs.io/en/latest/examples/query_engine/recursive_retriever_agents.html)

My final query engine can retrieve the correct node with id and thus find the correct agent. However, the agent didn't even choose which QueryEngineTool to use. Why is that?

Attachment

Logan M

You can see the tool used (if any) if you set verbose=True on the agent

It looks like it called the underlying tool/query engine to me though. But likely the actual query to the query engine was too specific and the LLM got a little confused (If I had to guess anyways)

Dokmy

got it. Let me try

Dokmy

Hey Logan you're right. The agent has actually called the summary_tool. But what I don't understand is why it cannot answer such a simple query?

Attachment

Dokmy

I have already defined the description in the summary_index_engine, saying that it's "Useful for summarization questions related to this legal case with case number: DCPI002109_2013"

Attachment

Dokmy

To give you more context, I am building a chatbot for lawyers so they can ask questions about different legal cases

Dokmy

So in the QueryEngineTool of the summary index, the query_engine is actually querying the index that is made of the docs of a case

Dokmy

this is my code, if needed

Dokmy

thanks!

Logan M

So the query engine is completely separate from the agent

If you look at the input, all it queried with was the case number. The query engine is going to have no idea what that means. I would write descriptions that maybe encourage the agent to write better queries for the tools

Dokmy

Okok! Thanks. Let me look into it

Dokmy

Hey @Logan M, is there any documentation, video, or tutorial that explains agents in depth? I only learnt from you just now that the "input" is actually written by the agent. But how does the description help it write a better "input"?

In the "Recursive Retriever + Document Agents" guide, in the last example, the query is "Give me a summary on all the positive aspects of Chicago"

The "input" is "positive aspects of Chicago"

The description for the summary tool is f"Useful for summarization questions related to {wiki_title}"

The description here is also pretty short.
I fail to see how the description of the summary tool can help the agent write a better input for the user's query. Perhaps I don't understand agents well enough.

Dokmy

I thought the agent was just a decision maker that decides which query_engine to use. I didn't know it also writes the "input" part after choosing which query_engine to use

Dokmy

I added "Use a detailed plain text question as input to the tool." to both descriptions
And the input did become a full sentence
but the answer is still wrong

Attachment
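
A minimal sketch of the description change described in the message above. The helper names `summary_tool_description` and `build_summary_tool` are hypothetical, and the `llama_index.tools` import path assumes the legacy (pre-0.10) package layout that was current when this thread was written. The wording of the description is taken from the messages in this thread; whether it actually improves the agent's input depends on the LLM.

```python
def summary_tool_description(case_number: str) -> str:
    """Build a tool description that names the case and asks for a full question."""
    return (
        "Useful for summarization questions related to the legal case "
        f"with case number {case_number}. "
        "Use a detailed plain text question as input to the tool."
    )


def build_summary_tool(query_engine, case_number: str):
    # Local import so the helper above works without llama_index installed.
    # Assumption: legacy (pre-0.10) llama_index import path.
    from llama_index.tools import QueryEngineTool, ToolMetadata

    return QueryEngineTool(
        query_engine=query_engine,
        metadata=ToolMetadata(
            name="summary_tool",
            description=summary_tool_description(case_number),
        ),
    )


print(summary_tool_description("DCPI002109_2013"))
```

The agent reads this description when it decides which tool to call and what to write as the input, so this string is the only place to tell it what kind of input the tool expects.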

Logan M

Hmm. If it's going to insist on including the case number in the query, then I would update the system prompt for the query engine tools to give the LLM a reference to which case it has data for

For example

case_1_context = ServiceContext.from_defaults(..., system_prompt="You are a Q&A bot who has access to information about case number 1")
case_1_query_engine = index.as_query_engine(..., service_context=case_1_context)

Logan M

So how an agent works is it looks at the chat history and list of tool names+descriptions, and decides which tool to use and writes the query for it.

Then, the tool (a query engine in this case, but it could be any function) runs completely standalone -- all it sees is the input
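
The two steps described above can be sketched in plain Python. This is a toy illustration, not how LlamaIndex implements agents: `Tool`, `overlap_chooser`, and `run_agent_step` are hypothetical names, the chat history is left out, and the word-overlap chooser is only a deterministic stand-in for the LLM's decision.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Tool:
    name: str
    description: str
    fn: Callable[[str], str]  # the tool only ever receives one input string


def overlap_chooser(user_message: str, catalog: Dict[str, str]) -> Tuple[str, str]:
    """Hypothetical stand-in for the LLM's decision step.

    Picks the tool whose description shares the most words with the user
    message and passes the whole message through as the tool input.
    """
    words = set(user_message.lower().split())
    best = max(
        catalog,
        key=lambda name: len(words & set(catalog[name].lower().split())),
    )
    return best, user_message


def run_agent_step(user_message: str, tools: List[Tool], choose=overlap_chooser):
    # The decision step sees the user message plus every tool name and description.
    catalog = {tool.name: tool.description for tool in tools}
    tool_name, tool_input = choose(user_message, catalog)
    # The chosen tool runs standalone: it gets tool_input and nothing else.
    selected = next(tool for tool in tools if tool.name == tool_name)
    return tool_name, tool_input, selected.fn(tool_input)


tools = [
    Tool("summary_tool", "Useful for summarization questions about one legal case",
         lambda text: f"summary engine saw only: {text!r}"),
    Tool("vector_tool", "Useful for retrieving specific facts from one legal case",
         lambda text: f"vector engine saw only: {text!r}"),
]

name, tool_input, output = run_agent_step(
    "I have a summarization question about the case", tools
)
print(name, "|", output)
```

Because the tool function receives nothing but `tool_input`, an input that is only a case number leaves the query engine with no question to answer.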

Dokmy

I added these service contexts, but the answer is still bad.

Attachment

Logan M

idk man. Maybe try putting the service context in the from_documents() call instead. Or try using a different LLM 🤷‍♂️ gpt-4 or gpt-3.5-turbo-instruct

You can also debug what the LLM sees with

import openai
openai.log = "debug"

Dokmy

hey Logan, by adding service context into from_documents, it works now!
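
A hedged sketch of what that fix might look like, since the actual code was not posted. It assumes the legacy (pre-0.10) llama_index API used elsewhere in this thread; `case_system_prompt` and `build_case_query_engine` are hypothetical helper names, and `SummaryIndex` is a guess based on the summary index mentioned earlier.

```python
def case_system_prompt(case_number: str) -> str:
    """System prompt telling the LLM which case its data belongs to."""
    return (
        "You are a Q&A bot who has access to information "
        f"about case number {case_number}."
    )


def build_case_query_engine(documents, case_number: str):
    # Local import so this module loads without llama_index installed.
    # Assumption: legacy (pre-0.10) llama_index, where ServiceContext exists.
    from llama_index import ServiceContext, SummaryIndex

    service_context = ServiceContext.from_defaults(
        system_prompt=case_system_prompt(case_number)
    )
    # Pass the service context when the index is built, not only when the
    # query engine is created, as suggested above.
    index = SummaryIndex.from_documents(documents, service_context=service_context)
    return index.as_query_engine()


print(case_system_prompt("1"))
```

With the context attached to the index, the query engine created from it inherits the same settings, so a query that mentions only the case number still reaches an LLM that knows which case it is answering about.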

Dokmy

But I do have a separate challenge now, which might be a very common one:

I am trying to build a chatbot for lawyers in Hong Kong to help them speed up their process of legal research, aka finding relevant past cases.

So lawyers may ask things like:

"Find me 4 cases about work injuries that involve slipping in a shopping mall."
"Find me 5 cases in which the plaintiff suffers psychologically such as PTSD and depression."
"If my client is speeding and collides with a jaywalker, is he liable? Find me a few cases and answer based on them."

There are more than 100k cases, and some of them run to more than 40 pages of PDF.

My challenge is how to summarise each of them concisely without leaving out important information, so that I can pass the summary to the "text" property of its IndexNode.

How would you recommend tackling this problem? Thanks!

Umpqua

I want something similar!
