Graph query

At a glance

A community member's composable graph query is using about 8 nodes even though similarity_top_k is set to 3. The graph is a GPTListIndex composed over two GPTSimpleVectorIndex instances, so each query retrieves 3 nodes from each vector index plus the two index summaries, for 8 chunks of text in total. The thread also covers problems with the refine step, which has been a common issue with GPT-3.5 lately; one member shares a refine template that may help. Finally, the members discuss how the graph answers a question that spans two indexes, and whether combining all documents into one index would be a simpler solution.

I added this:
Plain Text
{
    "index_struct_type": "simple_dict",
    "query_mode": "default",
    "query_kwargs": {
        "similarity_top_k": 3,
    },
}

But it seems the query uses about 8 nodes.
Right, it also depends on what the top-level indexes are

What is your graph structure?
Plain Text
graph = ComposableGraph.from_indices(
    GPTListIndex,
    index_arr,
    index_summaries=summaries,
    service_context=self.service_context,
)

It contains two indices
Each one is a GPTSimpleVectorIndex built from several documents: GPTSimpleVectorIndex.from_documents(documents, service_context=self.service_context)
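For context, the pieces referenced in the graph construction above would be built something like this (a sketch; docs_a, docs_b, and the summary strings are hypothetical stand-ins for the actual documents and summaries):
Plain Text
# Hypothetical construction of index_arr and summaries used above
index_a = GPTSimpleVectorIndex.from_documents(docs_a, service_context=service_context)
index_b = GPTSimpleVectorIndex.from_documents(docs_b, service_context=service_context)

index_arr = [index_a, index_b]
summaries = ["Summary of what docs_a covers", "Summary of what docs_b covers"]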
Right, so it's going to end up querying 3 nodes from each vector index, plus the summaries of both indexes get used

So 8 total chunks of text
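To spell the arithmetic out (following the explanation above):
Plain Text
num_vector_indexes = 2   # two GPTSimpleVectorIndex sub-indexes in the graph
top_k = 3                # similarity_top_k from the query config
num_summaries = num_vector_indexes  # one summary per sub-index, used by the GPTListIndex

total_chunks = num_vector_indexes * top_k + num_summaries  # 2 * 3 + 2 = 8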
I see these two answer nodes in source_nodes:
Plain Text
[
    {
        "node": {
            "doc_hash": "5246e5461ab428ab6238789afd48826009d87ccf39a43750885918911215471d",
            "doc_id": "ab561760-1429-4f8a-b1b5-08611391a1a3",
            "embedding": null,
            "extra_info": null,
            "node_info": null,
            "relationships": {},
            "text": "This seems to be the correct answer"
        },
        "score": null
    },
    {
        "node": {
            "doc_hash": "ea8067f36510cf58342c3b80cb01c8be736745819d21a65c974274e56ca18dc2",
            "doc_id": "29cddf2b-cbf8-4ed9-9dd2-6a2c77168754",
            "embedding": null,
            "extra_info": null,
            "node_info": null,
            "relationships": {},
            "text": "Sorry, there is still no information provided in the given context about the question"
        },
        "score": null
    }
]

and the final response is: "Original answer still stands as the new context does not provide any information"
That response is extremely common from gpt3.5 lately πŸ˜”πŸ˜”πŸ˜” but maybe a fix coming soon, it's a major prompt engineering problem
The first node's answer seems correct, but when it tries to refine, the answer becomes nothing.
Yup, I know. OpenAI recently downgraded (i.e. updated) gpt-3.5 and it is causing tons of problems with the refine process
I can share a refine template I've been working on, if you want to try passing it in. It may help
I wonder how the graph queries the question when it contains two indexes. Does it query each index, get two answers, and then refine the two answers together?
Exactly, you got it
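Roughly, the flow looks like this (a simplified mental model, not the actual library internals; refine_answers is a hypothetical stand-in for the list index's response synthesis step):
Plain Text
def graph_query(question, sub_indexes):
    # Step 1: each GPTSimpleVectorIndex answers independently, retrieving
    # its own similarity_top_k nodes for the question.
    sub_answers = [index.query(question) for index in sub_indexes]

    # Step 2: the top-level GPTListIndex treats those answers (plus the
    # index summaries) as context and refines them into one final response.
    return refine_answers(question, sub_answers)  # hypothetical synthesis step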
I see. Maybe, in my case, I can combine all documents into one index.
If the graph contains only one index, it won't refine the answer, right?
Since the top_k is 3, it will still refine :PSadge:
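If you do try the single-index route, it would look something like this (a sketch; documents_a and documents_b are hypothetical stand-ins for your document lists, and with similarity_top_k=3 the refine step still runs across the retrieved chunks):
Plain Text
# Sketch: merge all documents into a single vector index instead of a graph.
all_documents = documents_a + documents_b  # hypothetical: your combined documents
index = GPTSimpleVectorIndex.from_documents(
    all_documents, service_context=service_context
)

# The answer is built from the first retrieved chunk and then refined
# against the other two, so a good refine template still matters.
response = index.query("your question", similarity_top_k=3)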

Maybe try this first

Plain Text
from langchain.prompts.chat import (
    AIMessagePromptTemplate,
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
)

from llama_index.prompts.prompts import RefinePrompt

# Refine Prompt
CHAT_REFINE_PROMPT_TMPL_MSGS = [
    HumanMessagePromptTemplate.from_template("{query_str}"),
    AIMessagePromptTemplate.from_template("{existing_answer}"),
    HumanMessagePromptTemplate.from_template(
        "I have more context below which can be used "
        "(only if needed) to update your previous answer.\n"
        "------------\n"
        "{context_msg}\n"
        "------------\n"
        "Given the new context, update the previous answer to better "
        "answer my previous query."
        "If the previous answer remains the same, repeat it verbatim. "
        "Never reference the new context or my previous query directly.",
    ),
]


CHAT_REFINE_PROMPT_LC = ChatPromptTemplate.from_messages(CHAT_REFINE_PROMPT_TMPL_MSGS)
CHAT_REFINE_PROMPT = RefinePrompt.from_langchain_prompt(CHAT_REFINE_PROMPT_LC)
...
query_configs = [
  {
     "index_struct_type": "simple_dict",
     "query_mode": "default",
     "query_kwargs": {
           "similarity_top_k": 3,
           "refine_template": CHAT_REFINE_PROMPT 
     },
  },
  {
     "index_struct_type": "list",
     "query_mode": "default",
     "query_kwargs": {
           "refine_template": CHAT_REFINE_PROMPT 
     },
  }
]
Just guessing at what your query configs look like, but basically you want to pass the refine template into the config of both the list and vector indexes
I've been testing this prompt and it worked well on my test data
(I'm assuming you are using gpt 3.5 haha)
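And then the configs get passed in when querying the graph, something like this (same old-style llama_index API as above):
Plain Text
response = graph.query(
    "your question",
    query_configs=query_configs,  # applies CHAT_REFINE_PROMPT to both sub-queries
)
print(response)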