Graph query

I created a function for the query (instead of a lambda function). How do I see exactly which index is being used?

Plain Text

        def queryFunc(query: str) -> str:
            response = index.query(query, similarity_top_k=3)
            print(response.source_nodes)
            return response 

        tools.append(Tool(
            name=f"Vector Index {index_name}",
            func=queryFunc,
            description=f"use this tool to answer questions solely about {company_name} {document_type}, year {document_year}. Do not use this tool for comparison with other documents/companies/reports.",
            return_direct=True
        ))

47 comments

Logan M

Did you create a different function for each index? From what I can tell, this is querying the same index if you have multiple tools.

pikachu8887867

@Logan M Yes, a new function is created for each index (I modified the code a bit):

Plain Text

    for index_name, index in index_set.items():
        index_name_split = index_name.split("$$$")
        company_name, document_type, document_year = index_name_split[0], index_name_split[1], index_name_split[2]

        def query(query: str) -> str:
            response = index.query(query, similarity_top_k=3)
            print(response.source_nodes)
            return response

        tools.append(Tool(
            name=f"Vector Index {index_name}",
            func=query,
            description=f"use this tool to answer questions solely about {company_name} {document_type}, year {document_year}. Do not use this tool for comparison with other documents/companies/reports.",
            return_direct=True
        ))
    agent_chain = initialize_agent(tools, llm, agent="conversational-react-description", memory=memory, verbose=True)

pikachu8887867

Basically it is the same query function, but I'm creating it dynamically in a for loop.

Logan M

My Python sense is tingling here. I think creating the function defs in a loop like that is causing issues.

Can you try a slightly different approach? Add the index as an argument to the query function, and move the function def out of the loop.

Then when you create the tool, rather than passing the function itself, create a wrapper lambda, something like func=lambda prompt: str(query(index_set[index_name], prompt))

pikachu8887867

I put the function outside the for loop, but the issue remains:

In the logs it shows that it is using Vector Index Deutsche Bank (which is correct)

but in the breakpoint it is using a different index

Attachment

Logan M

:PepeHands:

pikachu8887867

Plain Text

    def query(index: str, prompt: str) -> str:
        response = index.query(prompt, similarity_top_k=3)
        print(response.source_nodes)
        return response

    tools = []

    for index_name, index in index_set.items():
        index_name_split = index_name.split("$$$")
        company_name, document_type, document_year = index_name_split[0], index_name_split[1], index_name_split[2]

        tools.append(Tool(
            name=f"Vector Index {index_name}",
            func=lambda prompt: str(query(index_set[index_name], prompt)),
            description=f"use this tool to answer questions solely about {company_name} {document_type}, year {document_year}. Do not use this tool for comparison with other documents/companies/reports.",
            return_direct=True
        ))
    # tools.append(Tool(
    #     name=f"Graph Index",
    #     func=lambda q: str(graph.query(q, query_configs=query_configs)),
    #     description="use this tool when you want to answer queries that require analyzing multiple documents/companies/reports. If they mention comparing more than one source, use this tool.",
    #     return_direct=True
    # ))

    agent_chain = initialize_agent(tools, llm, agent="conversational-react-description", memory=memory, verbose=True)

pikachu8887867

wait... I'm sending a string index_set[index_name] , how can I even do index.query(prompt, similarity_top_k=3) ???

Logan M

You can still do that in your query function.

index_set[index_name] should be mapped to the actual index, right? And you pass the index into the query function, so inside there you can still add the top_k and whichever other options you need.

Logan M

At this point, it's either a bug with LangChain or a bug with how it's being used. :consequences:

If it were me, I would set a breakpoint just before running the agent chain, and step into all the LangChain code to see where the misalignment is happening. Should be easy enough since you are using PyCharm already.

pikachu8887867

Alright, let me debug. Btw, is there a way to know the name of the index being used, instead of investigating the object? I use the summary variable to know which index is being used.

Attachment

Logan M

Summary is a good thing to look at. I thiiiiink you can also set/check index.index_struct.index_id

pikachu8887867

'f0f69190-f136-42ec-8970-160ad2b5b11c' , not informative enough 😁

I was thinking something like:

Plain Text

index.name
> Vector Index UBS

the same name I used in:

Plain Text

    Tool(
        name=f"Vector Index {index_name}",
        func=lambda prompt: str(queryFunc(index, prompt)),
        description=f"use this tool to answer questions solely about {company_name} {document_type}, year {document_year}. Do not use this tool for comparison with other documents/companies/reports.",
        return_direct=True
    )

Ah.. but this is a tool, not an index. I'm kinda confused 🙂
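One way to get a readable name at debug time is to carry the name next to the index object. The sketch below is only an illustration: `NamedIndex` and `FakeIndex` are hypothetical names (neither is part of LlamaIndex or LangChain), and the only assumption about the real index is that it exposes a `query` method, as in the code above.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class NamedIndex:
    """Hypothetical wrapper pairing an index object with a readable name."""
    name: str
    index: Any

    def query(self, prompt: str, **kwargs: Any) -> Any:
        # Report which index is answering, then delegate to the wrapped index.
        print(f"querying {self.name}")
        return self.index.query(prompt, **kwargs)


class FakeIndex:
    """Stand-in for a real vector index, used only so the sketch runs."""

    def __init__(self, label: str) -> None:
        self.label = label

    def query(self, prompt: str, **kwargs: Any) -> str:
        return f"{self.label} answered: {prompt}"


# Same "$$$"-separated keys as in the thread; the first part is the company name.
index_set = {"UBS$$$Annual Report$$$2022": FakeIndex("UBS")}
named = {
    key: NamedIndex(name=f"Vector Index {key.split('$$$')[0]}", index=idx)
    for key, idx in index_set.items()
}
```

With this, `named[key].name` gives the same label that was passed to the Tool, and a breakpoint inside `NamedIndex.query` shows the name directly.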

pikachu8887867

@Logan M At query time, I can see that the agent contains both tools. The issue is happening during the agent_chain.run(input=input) part. Is there a way to unwrap the run function and use a step-by-step approach instead?

Attachment

Logan M

Are you not able to step into the run function?

pikachu8887867

@Logan M no, unfortunately, PyCharm doesn't allow me to place a breakpoint inside the non-project code. I can only place a breakpoint inside my code.

Logan M

Right, but the debugger has "step into" and "step over" functions, allowing you to either follow execution through functions, or continue in the current file line by line.

pikachu8887867

@Logan M Oh. Didn't know about that function. Let me see..

Logan M

Super powerful feature! I don't have PyCharm open, but I trust you'll find the buttons 🙏

pikachu8887867

@Logan M I investigated the source code of the run function and here is what I found:

It works okay up to the Tool class -> _run function. There it has return self.func(tool_input). That func calls:

lambda prompt: str(queryFunc(idx, prompt)), in my tool.

And here, idx is not the index that I want, but an index of another tool. That's where the issue arises.
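A quick way to confirm this without stepping through library code is to ask Python what each lambda currently sees. This is a minimal sketch using the standard library's `inspect.getclosurevars`; `build_funcs`, `bound_names`, and the string stand-ins for the indexes are made up for illustration. It assumes the tools are built inside a function (if the loop runs at module level, the loop variable appears under `.globals` instead of `.nonlocals`).

```python
import inspect


def build_funcs(index_set):
    """Mimic the loop from the thread, with strings standing in for indexes."""
    funcs = {}
    for index_name, idx in index_set.items():
        funcs[index_name] = lambda prompt: f"{idx}: {prompt}"
    return funcs


def bound_names(func):
    """Return the enclosing-scope variables the function sees right now."""
    return dict(inspect.getclosurevars(func).nonlocals)


funcs = build_funcs({"Deutsche Bank": "db_index", "Credit Suisse": "cs_index"})

# Map each tool name to the variables its lambda resolves at call time.
seen = {name: bound_names(func) for name, func in funcs.items()}
print(seen)
```

Both entries in `seen` report the same `idx`, the one from the final loop iteration, which is the mismatch observed in the debugger.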

Logan M

Oooooo

pikachu8887867

Isn't that because I'm creating the tools in a loop?

Plain Text

for index_name, idx in index_set.items():
            tools.append(Tool(
                name=f"Vector Index {index_name}",
                func=lambda prompt: str(queryFunc(idx, prompt)),
                description=f"descr",
                return_direct=True
            ))

pikachu8887867

So since the func in LangChain is not sending an index, I guess that the index being sent to my queryFunc is just the last index in the loop.

pikachu8887867

I think everything apart from the prompt itself is being gathered dynamically, and it just plugs in the last idx in the loop.

pikachu8887867

Or maybe I'm mistaken.. what do you think?
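This guess matches how Python closures behave: a lambda looks up names from its enclosing scope when it is called, not when it is defined (often called late binding). The sketch below reproduces the effect without LangChain; `make_callbacks` and the string values are illustrative stand-ins, not code from the thread.

```python
def make_callbacks():
    callbacks = []
    for idx in ["index_a", "index_b", "index_c"]:
        # The lambda keeps a reference to the variable `idx`, not to its
        # current value, so the lookup happens later, when it is called.
        callbacks.append(lambda prompt: f"{idx} <- {prompt}")
    return callbacks


# By the time any callback runs, the loop has finished and idx == "index_c".
results = [callback("q") for callback in make_callbacks()]
print(results)
```

All three callbacks report `index_c`, just as every tool in the loop ended up querying the last index.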

Logan M

That could be it 🤔 maybe some python pass-by-reference crap is happening lol

Maybe try creating your tools without the loop? (It would be super hard coded and manual, but just a test)

pikachu8887867

@Logan M Yes, I will try to create the Tools without a loop now. However, even if it works, I can't create them like that when the number of documents grows :/

Logan M

There must be a solution 🤔 but let's see if it works without the loop first, then we can decide if the loop was the issue and how to fix it

pikachu8887867

@Logan M Yes, that is because of the loop 😁

Plain Text

    index1 = index_set["Deutsche Bank$$$Annual Report$$$2022"]
    index2 = index_set["Credit Suisse$$$Annual Report$$$2022"]

    tool1 = Tool(
        name="Vector Index Deutsche Bank Annual Report 2022",
        func=lambda prompt: str(queryFunc(index1, prompt)),
        description="desc",
        return_direct=True
    )
    tool2 = Tool(
        name="Vector Index Credit Suisse Annual Report 2022",
        func=lambda prompt: str(queryFunc(index2, prompt)),
        description="desc1",
        return_direct=True
    )

    tools = [tool1, tool2]

This time it is passing the correct index

Logan M

Wow! So, we just need to figure out how to properly make the tools in the loop then 💪

pikachu8887867

Yeah 🙂

I need to refresh my memory on how to work with this stuff. I don't even remember what it's called (dynamic function call?). What was it? 🙂

Logan M

Well, I think the original loop had some pass-by-reference issues, just need to make sure we avoid that when using the loop 🤔
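One common way to keep the loop is to build each function through a small factory, so that every call to the factory gets its own `idx` binding. This is a sketch under assumptions: `make_tool_func` is a hypothetical helper, `query_func` stands in for the thread's `queryFunc`, and plain strings stand in for the index objects.

```python
from typing import Any, Callable, Dict


def query_func(idx: Any, prompt: str) -> str:
    """Stand-in for queryFunc from the thread; a real one would call idx.query."""
    return f"{idx} <- {prompt}"


def make_tool_func(idx: Any) -> Callable[[str], str]:
    # `idx` is a parameter of this call, so the inner function closes over
    # a fresh variable each time instead of the shared loop variable.
    def tool_func(prompt: str) -> str:
        return str(query_func(idx, prompt))

    return tool_func


index_set: Dict[str, str] = {
    "Deutsche Bank": "db_index",
    "Credit Suisse": "cs_index",
}

tool_funcs: Dict[str, Callable[[str], str]] = {}
for index_name, idx in index_set.items():
    tool_funcs[index_name] = make_tool_func(idx)
```

Each entry in `tool_funcs` could then be passed as the `func` of its own Tool, and the number of tools can grow with the number of documents.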

Logan M

Python is tricky sometimes haha

pikachu8887867

True 🙂 Let me figure out how I can overcome this issue.

I'm wondering, though, why no one raised this issue before. I mean, isn't that a common thing, to create multiple tools in a loop?

Logan M

What if you use all optional parameters on the query function?

def query(prompt, index_name=None)

Then
func=lambda prompt: str(query(prompt, index_name=index_name))

I'm not sure how other people avoid this hahaha maybe there is some obvious error we are both missing

pikachu8887867

yeah, that's what I'm basically doing

pikachu8887867

Plain Text

def queryFunc(idx: GPTSimpleVectorIndex, prompt: str) -> str:
    response = idx.query(prompt, similarity_top_k=3)
    print(response.source_nodes)
    return response


for index_name, idx in index_set.items():
        index_name_split = index_name.split("$$$")
        company_name, document_type, document_year = index_name_split[0], index_name_split[1], index_name_split[2]

        tools.append(Tool(
            name=f"Vector Index {index_name}",
            func=lambda prompt: str(queryFunc(idx, prompt)),
            description=f"desc.",
            return_direct=True
        ))

Logan M

Right, but rather than passing in the entire index object, what if we just pass in the key to the index_set dictionary?

pikachu8887867

I see. Let me try

pikachu8887867

It is passing the last index_name in the loop..

Logan M

🤔🤔🤔

pikachu8887867

Yahoo! I fixed it (Probably) 😄

Logan M

Let's goooooo :dotsCATJAM: :dotsHARDSTYLE:

pikachu8887867

lambda prompt=index_name: str(queryFunc(index_name, prompt))

----became----

lambda prompt, index_name=index_name: str(queryFunc(index_name, prompt))

pikachu8887867

so I added extra index_name=index_name in the lambda and it passes the current index_name and not the last one
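The fix works because default argument values are evaluated once, when the lambda is defined, so each lambda stores the `index_name` of its own iteration. The sketch below shows that pattern next to `functools.partial`, which is another standard way to freeze an argument; `query_func` and the name list are stand-ins for the thread's `queryFunc` and `index_set`.

```python
from functools import partial


def query_func(index_name: str, prompt: str) -> str:
    """Stand-in for queryFunc; a real one would look up and query the index."""
    return f"{index_name} <- {prompt}"


index_names = ["Deutsche Bank", "Credit Suisse"]

default_funcs = []
partial_funcs = []
for index_name in index_names:
    # The default value is captured at definition time, once per iteration.
    default_funcs.append(
        lambda prompt, index_name=index_name: str(query_func(index_name, prompt))
    )
    # partial stores index_name as the first positional argument right away.
    partial_funcs.append(partial(query_func, index_name))

default_results = [func("q") for func in default_funcs]
partial_results = [func("q") for func in partial_funcs]
```

One thing to keep in mind with the default-argument form: a caller that passes a second positional argument would silently override `index_name`. With `partial`, the name is fixed up front and an extra positional argument raises a `TypeError` instead.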

Logan M

Glad you were able to figure it out! 💪💪

pikachu8887867

Thank you very much @Logan M !!
