Updated 3 months ago

Graph query

I created a function for the query (instead of a lambda function). How do I see exactly which index is being used?

Plain Text
        def queryFunc(query: str) -> str:
            response = index.query(query, similarity_top_k=3)
            print(response.source_nodes)
            return response 

        tools.append(Tool(
            name=f"Vector Index {index_name}",
            func=queryFunc,
            description=f"use this tool to answer questions solely about {company_name} {document_type}, year {document_year}. Do not use this tool for comparison with other documents/companies/reports.",
            return_direct=True
        ))
47 comments
Did you create a different function for each index? From what I can tell, this is querying the same index if you have multiple tools.
@Logan M Yes, a new function is created for each index (I modified the code a bit):

Plain Text
    for index_name, index in index_set.items():
        index_name_split = index_name.split("$$$")
        company_name, document_type, document_year = index_name_split[0], index_name_split[1], index_name_split[2]

        def query(query: str) -> str:
            response = index.query(query, similarity_top_k=3)
            print(response.source_nodes)
            return response

        tools.append(Tool(
            name=f"Vector Index {index_name}",
            func=query,
            description=f"use this tool to answer questions solely about {company_name} {document_type}, year {document_year}. Do not use this tool for comparison with other documents/companies/reports.",
            return_direct=True
        ))
    agent_chain = initialize_agent(tools, llm, agent="conversational-react-description", memory=memory, verbose=True)
Basically it is the same query function, but I'm creating it dynamically in a for loop.
My python sense is tingling here. I think creating the function defs in a loop like that is causing issues.

Can you try a slightly different approach? Add the index as an argument to the query function, and move the function def out of the loop

Then when you create the tool, rather than passing the function itself, create a wrapper lambda, something like func=lambda prompt: str(query(index_set[index_name], prompt)),
I put the function outside the for loop, but the issue remains:

In the logs it shows that it is using Vector Index Deutsche Bank (which is correct)

but in the breakpoint it is using a different index
Plain Text
    def query(index: str, prompt: str) -> str:
        response = index.query(prompt, similarity_top_k=3)
        print(response.source_nodes)
        return response

    tools = []

    for index_name, index in index_set.items():
        index_name_split = index_name.split("$$$")
        company_name, document_type, document_year = index_name_split[0], index_name_split[1], index_name_split[2]

        tools.append(Tool(
            name=f"Vector Index {index_name}",
            func=lambda prompt: str(query(index_set[index_name], prompt)),
            description=f"use this tool to answer questions solely about {company_name} {document_type}, year {document_year}. Do not use this tool for comparison with other documents/companies/reports.",
            return_direct=True
        ))
    # tools.append(Tool(
    #     name=f"Graph Index",
    #     func=lambda q: str(graph.query(q, query_configs=query_configs)),
    #     description="use this tool when you want to answer queries that require analyzing multiple documents/companies/reports. If they mention comparing more than one source, use this tool.",
    #     return_direct=True
    # ))

    agent_chain = initialize_agent(tools, llm, agent="conversational-react-description", memory=memory, verbose=True)
Wait... I'm sending a string, index_set[index_name], so how can I even call index.query(prompt, similarity_top_k=3)?
You can still do that in your query function.

index_set[index_name] should be mapped to the actual index right? And you pass the index into the query function, so then inside there you can still add the top_k and whichever other options you need
At this point, it's either a bug with langchain or a bug with how it's being used.

If it were me, I would set a breakpoint just before running the agent chain, and step into all the langchain code to see where the misalignment is happening. Should be easy enough since you are using PyCharm already.
Alright, let me debug. Btw, is there a way to know the name of the index being used, instead of investigating the object? I use the summary variable to know which index is being used.
Summary is a good thing to look at. I thiiiiink you can also set/check index.index_struct.index_id
'f0f69190-f136-42ec-8970-160ad2b5b11c' , not informative enough 😁

I was thinking something like:

Plain Text
index.name
> Vector Index UBS


the same name I used in:

Plain Text
Tool(
            name=f"Vector Index {index_name}",
            func=lambda prompt: str(queryFunc(index, prompt)),
            description=f"use this tool to answer questions solely about {company_name} {document_type}, year {document_year}. Do not use this tool for comparison with other documents/companies/reports.",
            return_direct=True
        )


Ah.. but this is a tool, not an index. I'm kinda confused 🙂
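One way to get a readable name back from an index object, without relying on any attribute of the index itself, is to build a reverse lookup from the index_set dictionary that already maps names to indexes. This is only a sketch: FakeIndex, name_by_id, and describe are hypothetical names made up for illustration, and FakeIndex stands in for whatever index class is actually stored in index_set.

```python
class FakeIndex:
    """Hypothetical stand-in for an index object; not a llama_index class."""


# Same "$$$"-separated key format as in the thread.
index_set = {
    "Deutsche Bank$$$Annual Report$$$2022": FakeIndex(),
    "Credit Suisse$$$Annual Report$$$2022": FakeIndex(),
}

# Reverse lookup keyed on object identity. This is safe here because
# index_set keeps every index alive for as long as the lookup is used.
name_by_id = {id(index): name for name, index in index_set.items()}


def describe(index) -> str:
    """Return a human-readable label for an index held in index_set."""
    company, doc_type, year = name_by_id[id(index)].split("$$$")
    return f"Vector Index {company} {doc_type} {year}"


some_index = index_set["Deutsche Bank$$$Annual Report$$$2022"]
print(describe(some_index))  # Vector Index Deutsche Bank Annual Report 2022
```

Calling describe(idx) inside the query function or at a breakpoint would then show which entry of index_set is actually being queried.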
@Logan M At query time, I can see that the agent contains both tools. The issue is happening during the agent_chain.run(input=input) part. Is there a way to unwrap the run function and use a step-by-step approach instead?
Are you not able to step into the run function?
@Logan M no, unfortunately, PyCharm doesn't allow me to place a breakpoint inside the non-project code. I can only place a breakpoint inside my code.
Right, but the debugger has a "step into" "step over" function, allowing you to either follow execution through functions, or continue in the current file line by line
@Logan M Oh. Didn't know about that function. Let me see..
Super powerful feature! I don't have PyCharm open, but I trust you'll find the buttons 🙏
@Logan M I investigated the source code of the run function and here is what I found:

  1. It works okay up to the Tool class's _run function, which has return self.func(tool_input). That func calls lambda prompt: str(queryFunc(idx, prompt)) in my tool.

And here, idx is not the index that I want, but an index of another tool. That's where the issue arises.
Isn't that because I'm creating the tools in a loop?

Plain Text
for index_name, idx in index_set.items():
            tools.append(Tool(
                name=f"Vector Index {index_name}",
                func=lambda prompt: str(queryFunc(idx, prompt)),
                description=f"descr",
                return_direct=True
            ))
So since the func in Langchain is not sending an index, I guess that the index being sent to my queryFunc is just the last index in the loop.
I think everything apart from the prompt itself is being looked up dynamically, and it just plugs in the last idx in the loop.
Or maybe I'm mistaken.. what do you think?
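The guess above matches what is usually called a late-binding closure in Python: a lambda defined in a loop looks up the loop variable when it is called, not when it is created. Here is a minimal sketch of the effect, with plain strings standing in for the indexes (build_late_bound is a made-up name for illustration, and no langchain code is involved):

```python
def build_late_bound(names):
    """Build one callable per name, the same way the loop in the thread does."""
    callables = []
    for name in names:
        # `name` is looked up when the lambda runs, not when it is defined.
        callables.append(lambda prompt: f"{name}: {prompt}")
    return callables


names = ["Deutsche Bank", "Credit Suisse", "UBS"]
tools = build_late_bound(names)

# Every callable sees the value `name` had when the loop finished.
results = [fn("revenue?") for fn in tools]
print(results)  # ['UBS: revenue?', 'UBS: revenue?', 'UBS: revenue?']

# All lambdas share one closure cell, which can be inspected directly.
print(tools[0].__closure__[0].cell_contents)  # UBS
```

Inspecting `__closure__` like this is also a quick way to check, at a breakpoint, which object a given tool's func would actually use.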
That could be it 🤔 maybe some python pass-by-reference crap is happening lol

Maybe try creating your tools without the loop? (It would be super hard coded and manual, but just a test)
@Logan M Yes, I will try to create the Tools without a loop now. However, even if it works, I can't create them like that when the number of documents grows :/
There must be a solution 🤔 but let's see if it works without the loop first, then we can decide if the loop was the issue and how to fix it
@Logan M Yes, that is because of the loop 😁

Plain Text
    index1 = index_set["Deutsche Bank$$$Annual Report$$$2022"]
    index2 = index_set["Credit Suisse$$$Annual Report$$$2022"]

    tool1 = Tool(
        name="Vector Index Deutsche Bank Annual Report 2022",
        func=lambda prompt: str(queryFunc(index1, prompt)),
        description="desc",
        return_direct=True
    )
    tool2 = Tool(
        name="Vector Index Credit Suisse Annual Report 2022",
        func=lambda prompt: str(queryFunc(index2, prompt)),
        description="desc1",
        return_direct=True
    )

    tools = [tool1, tool2]


This time it is passing the correct index
Wow! So, we just need to figure out how to properly make the tools in the loop then 💪
Yeah 🙂

I need to refresh my memory on how to work with this stuff. I don't even remember what it's called (dynamic function call?). What was it? 🙂
Well, I think the original loop had some pass-by-reference issues, just need to make sure we avoid that when using the loop 🤔
Python is tricky sometimes haha
True 🙂 Let me figure out how I can overcome this issue.

I'm wondering tho why no one raised this issue before. I mean, isn't that a common thing - to create multiple tools in a loop?
What if you use all optional parameters on the query function?

def query(prompt, index_name=None):

Then
func=lambda prompt: str(query(prompt, index_name=index_name))

I'm not sure how other people avoid this hahaha maybe there is some obvious error we are both missing
yeah, that's what I'm basically doing
Plain Text
def queryFunc(idx: GPTSimpleVectorIndex, prompt: str) -> str:
    response = idx.query(prompt, similarity_top_k=3)
    print(response.source_nodes)
    return response


for index_name, idx in index_set.items():
        index_name_split = index_name.split("$$$")
        company_name, document_type, document_year = index_name_split[0], index_name_split[1], index_name_split[2]

        tools.append(Tool(
            name=f"Vector Index {index_name}",
            func=lambda prompt: str(queryFunc(idx, prompt)),
            description=f"desc.",
            return_direct=True
        ))
Right, but rather than passing in the entire index object, what if we just pass in the key to the index_set dictionary?
It is passing the last index_name in the loop..
🤔🤔🤔
Yahoo! I fixed it (Probably) 😄
Let's goooooo
lambda prompt=index_name: str(queryFunc(index_name, prompt))


----became----

lambda prompt, index_name=index_name: str(queryFunc(index_name, prompt))
So I added an extra index_name=index_name default argument to the lambda, and now it passes the current index_name rather than the last one.
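The reason this works is that a default argument value is evaluated once, when the lambda is defined, so each loop iteration freezes its own index_name. Below is a small sketch of that fix next to an alternative using functools.partial. The names query_func, build_funcs, and build_funcs_partial are hypothetical, and query_func is only a stand-in for the thread's queryFunc so that the example runs without langchain or an index.

```python
from functools import partial


def query_func(index_name: str, prompt: str) -> str:
    # Stand-in for the thread's queryFunc; a real one would look up the
    # index by name and query it.
    return f"[{index_name}] {prompt}"


def build_funcs(index_names):
    """Default-argument fix, as found in the thread."""
    funcs = {}
    for index_name in index_names:
        # The default value is evaluated now, once per iteration.
        funcs[index_name] = lambda prompt, index_name=index_name: str(
            query_func(index_name, prompt)
        )
    return funcs


def build_funcs_partial(index_names):
    """Alternative: partial() stores index_name when it is created."""
    return {name: partial(query_func, name) for name in index_names}


names = ["Deutsche Bank", "Credit Suisse"]
funcs = build_funcs(names)
print(funcs["Deutsche Bank"]("net income?"))  # [Deutsche Bank] net income?
print(funcs["Credit Suisse"]("net income?"))  # [Credit Suisse] net income?

partial_funcs = build_funcs_partial(names)
print(partial_funcs["Deutsche Bank"]("net income?"))  # [Deutsche Bank] net income?
```

One difference worth knowing: with the default-argument form, a caller that passes a second positional argument would silently override index_name, whereas partial fixes the first argument. Since the thread shows the tool being invoked as self.func(tool_input) with a single argument, either form should behave the same there.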
Glad you were able to figure it out! 💪💪
Thank you very much @Logan M !!