
Updated 2 months ago

Hi I am trying to use Recursive

Hi, I am trying to use Recursive Retriever + Document Agent. (https://gpt-index.readthedocs.io/en/latest/examples/query_engine/recursive_retriever_agents.html)

My final query engine can retrieve the correct node with id and thus find the correct agent. However, the agent didn't even choose which QueryEngineTool to use. Why is that?
Attachment
image.png
20 comments
You can see the tool used (if any) if you set verbose=True on the agent

It looks like it called the underlying tool/query engine to me though. But likely the actual query to the query engine was too specific and the LLM got a little confused (If I had to guess anyways)
got it. Let me try
Hey Logan you're right. The agent has actually called the summary_tool. But what I don't understand is why it cannot answer such a simple query?
Attachment
image.png
I have already defined the description in the summary_index_engine, saying that it's "Useful for summarization questions related to this legal case with case number: DCPI002109_2013"
Attachment
image.png
To give you more context, I am building a chatbot for lawyers so they can ask questions about different legal cases
So in the QueryEngineTool of the summary index, the query_engine is actually querying an index built from the docs of a single case
this is my code, if needed
So the query engine is completely separate from the agent

If you look at the input, all it queried with was the case number. The query engine is going to have no idea what that means. I would write descriptions that maybe encourage the agent to write better queries for the tools
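A description along those lines could be built like the minimal sketch below. `build_summary_tool_description` is a hypothetical helper, not part of LlamaIndex, and the closing sentence that asks for a full question is only an assumption about what nudges the agent (it is the wording tried later in this thread).

```python
def build_summary_tool_description(case_number: str) -> str:
    """Build a tool description that also tells the agent how to phrase its input.

    Hypothetical helper for illustration; the returned string would be passed
    as the tool's description (e.g. via ToolMetadata in LlamaIndex, assuming
    the installed version exposes it).
    """
    return (
        "Useful for summarization questions related to the legal case "
        f"with case number {case_number}. "
        # Assumption: an explicit instruction makes the agent write a full question
        "Use a detailed plain text question as input to the tool."
    )


description = build_summary_tool_description("DCPI002109_2013")
print(description)
```

The description is the only thing the agent knows about the tool, so any guidance about how to phrase the input has to live in that string.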
Okok! Thanks. Let me look into it
hey @Logan M actually is there a documentation or video or tutorial that explains agents in depth? I only learnt from you just now that the "input" is actually written by the agent. But how does the description help in writing better "input"?

In the "Recursive Retriever + Document Agents" guide, in the last example, the query is "Give me a summary on all the positive aspects of Chicago"

The "input" is "positive aspects of Chicago"

The description for the summary tool is f"Useful for summarization questions related to {wiki_title}"

  1. the description here is also pretty short
  2. I fail to see how the description in the summary tool can help the agent write a better input from the user's query. Perhaps I don't understand agents well enough
I thought the agent was just a decision maker that decides which query_engine to use. I didn't know it also writes the "input" part after choosing which query_engine to use
I added "Use a detailed plain text question as input to the tool." to both descriptions
And the input did become a full sentence
but the answer is still wrong
Attachment
image.png
Hmm. If it's going to insist on including the case number in the query, then I would update the system prompt for the query engine tools to give the LLM reference to what case it has data for

For example

Plain Text
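from llama_index import ServiceContext
case_1_context = ServiceContext.from_defaults(..., system_prompt="You are a Q&A bot who has access to information about case number 1")
# pass the service context by keyword rather than positionally
case_1_query_engine = index.as_query_engine(..., service_context=case_1_context)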
So how an agent works is it looks at the chat history and list of tool names+descriptions, and decides which tool to use and writes the query for it.

Then, the tool (a query engine in this case, but it could be any function) runs completely standalone -- all it sees is the input
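To make that flow concrete, here is a toy sketch in plain Python. It is not LlamaIndex code: `Tool`, `stub_llm_choose`, and `run_agent_step` are hypothetical names, and the stub picks a tool by naive word overlap where a real agent would send the chat history plus tool names and descriptions to the LLM. The point is only that the tool function receives nothing except the input string the agent wrote.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Tool:
    name: str
    description: str
    fn: Callable[[str], str]  # the tool only ever receives the input string


def stub_llm_choose(chat_history: List[str], tools: List[Tool]) -> Dict[str, str]:
    """Stand-in for the LLM call: choose a tool and write its input."""
    user_message = chat_history[-1]
    words = set(user_message.lower().split())
    # Naive heuristic (assumption): most word overlap with the description wins
    best = max(tools, key=lambda t: len(words & set(t.description.lower().split())))
    return {"tool": best.name, "input": user_message}


def run_agent_step(chat_history: List[str], tools: List[Tool]) -> str:
    decision = stub_llm_choose(chat_history, tools)
    tool = next(t for t in tools if t.name == decision["tool"])
    # The tool runs standalone: no chat history, no description, just the input
    return tool.fn(decision["input"])


calls: List[Tuple[str, str]] = []  # records (tool name, input) for inspection


def summary_engine(query: str) -> str:
    calls.append(("summary_tool", query))
    return f"[summary answer for: {query}]"


def vector_engine(query: str) -> str:
    calls.append(("vector_tool", query))
    return f"[fact lookup for: {query}]"


tools = [
    Tool("summary_tool", "Useful for summary questions about the case", summary_engine),
    Tool("vector_tool", "Useful for retrieving specific facts", vector_engine),
]

answer = run_agent_step(["Give me a summary of the case"], tools)
print(answer)
print(calls)
```

If the input written by the agent is only a case number, that bare string is all the query engine gets to work with, which is why the description and the query engine's own system prompt both matter.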
I added these service contexts, but the answer is still bad.
Attachment
image.png
idk man. Maybe try putting the service context in the from_documents() call instead. Or try using a different LLM 🤷‍♂️ gpt-4 or gpt-3.5-turbo-instruct

You can also debug what the LLM sees by

Plain Text
import openai
openai.log = "debug"
hey Logan, by adding service context into from_documents, it works now!
But I do have a separate challenge now, which might be a very common one:

I am trying to build a chatbot for lawyers in Hong Kong to help them speed up their process of legal research, aka finding relevant past cases.

So lawyers may ask things like:

"Find me 4 cases about work injuries that involve slipping in a shopping mall."
"Find me 5 cases in which the plaintiff suffers psychologically such as PTSD and depression."
"If my client is speeding and collides with a jaywalker, is he liable? Find me a few cases and answer based on them."

There are more than 100k cases, and some of them run to more than 40 pages of PDF.

My challenge is how to summarise each case concisely, without leaving out important information, so I can pass the summary to the "text" property of its IndexNode.

How would you recommend I tackle this problem? Thanks!
I want something similar!