Graph query

At a glance

A community member's composable graph query is using about 8 nodes even though similarity_top_k is set to 3. The graph is a GPTListIndex composed over two GPTSimpleVectorIndex instances, so each query retrieves 3 nodes from each vector index plus the two index summaries, for 8 chunks of text in total. The thread also covers problems with the refine step, which has been a common issue with GPT-3.5 lately; one member shares a refine template that may help. Finally, the members discuss how the graph answers a question that spans two indexes, and whether combining all documents into one index would be a simpler solution.

I added this:
Plain Text
{
    "index_struct_type": "simple_dict",
    "query_mode": "default",
    "query_kwargs": {
        "similarity_top_k": 3,
    },
}

But it seems the query uses about 8 nodes.
Right, it also depends on what the top-level indexes are

What is your graph structure?
Plain Text
graph = ComposableGraph.from_indices(
    GPTListIndex,
    index_arr,
    index_summaries=summaries,
    service_context=self.service_context,
)

It contains two indices
Each one is a GPTSimpleVectorIndex built from several documents: GPTSimpleVectorIndex.from_documents(documents, service_context=self.service_context)
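For context, the pieces referenced in the graph construction above would be built something like this (a sketch; docs_a, docs_b, and the summary strings are hypothetical stand-ins for the actual documents and summaries):
Plain Text
# Hypothetical construction of index_arr and summaries used above
index_a = GPTSimpleVectorIndex.from_documents(docs_a, service_context=service_context)
index_b = GPTSimpleVectorIndex.from_documents(docs_b, service_context=service_context)

index_arr = [index_a, index_b]
summaries = ["Summary of what docs_a covers", "Summary of what docs_b covers"]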
Right, so it's going to end up querying 3 nodes from each vector index, plus the summaries of both indexes get used

So 8 total chunks of text
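To spell the arithmetic out (following the explanation above):
Plain Text
num_vector_indexes = 2   # two GPTSimpleVectorIndex sub-indexes in the graph
top_k = 3                # similarity_top_k from the query config
num_summaries = num_vector_indexes  # one summary per sub-index, used by the GPTListIndex

total_chunks = num_vector_indexes * top_k + num_summaries  # 2 * 3 + 2 = 8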
I see these two answer nodes in source_nodes:
Plain Text
[
    {
        "node": {
            "doc_hash": "5246e5461ab428ab6238789afd48826009d87ccf39a43750885918911215471d",
            "doc_id": "ab561760-1429-4f8a-b1b5-08611391a1a3",
            "embedding": null,
            "extra_info": null,
            "node_info": null,
            "relationships": {},
            "text": "This seems to be the correct answer"
        },
        "score": null
    },
    {
        "node": {
            "doc_hash": "ea8067f36510cf58342c3b80cb01c8be736745819d21a65c974274e56ca18dc2",
            "doc_id": "29cddf2b-cbf8-4ed9-9dd2-6a2c77168754",
            "embedding": null,
            "extra_info": null,
            "node_info": null,
            "relationships": {},
            "text": "Sorry, there is still no information provided in the given context about the question"
        },
        "score": null
    }
]

and the final response is: "Original answer still stands as the new context does not provide any information"
That response is extremely common from gpt3.5 lately πŸ˜”πŸ˜”πŸ˜” but maybe a fix coming soon, it's a major prompt engineering problem
The first node's answer seems correct, but when it tries to refine, the answer becomes nothing.
Yup, I know. OpenAI recently downgraded (i.e. updated) gpt-3.5 and it is causing tons of problems with the refine process
I can share a refine template I've been working on, if you want to try passing it in. It may help
I wonder how the graph queries the question when it contains two indexes. Does it query each index, get two answers, and then refine the two answers together?
Exactly, you got it
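Roughly, the flow looks like this (a simplified mental model, not the actual library internals; refine_answers is a hypothetical stand-in for the list index's response synthesis step):
Plain Text
def graph_query(question, sub_indexes):
    # Step 1: each GPTSimpleVectorIndex answers independently, retrieving
    # its own similarity_top_k nodes for the question.
    sub_answers = [index.query(question) for index in sub_indexes]

    # Step 2: the top-level GPTListIndex treats those answers (plus the
    # index summaries) as context and refines them into one final response.
    return refine_answers(question, sub_answers)  # hypothetical synthesis step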
I see. Maybe, in my case, I can combine all documents into one index.
If the graph contains only one index, it won't refine the answer, right?
Since the top_k is 3, it will still refine :PSadge:
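If you do try the single-index route, it would look something like this (a sketch; documents_a and documents_b are hypothetical stand-ins for your document lists, and with similarity_top_k=3 the refine step still runs across the retrieved chunks):
Plain Text
# Sketch: merge all documents into a single vector index instead of a graph.
all_documents = documents_a + documents_b  # hypothetical: your combined documents
index = GPTSimpleVectorIndex.from_documents(
    all_documents, service_context=service_context
)

# The answer is built from the first retrieved chunk and then refined
# against the other two, so a good refine template still matters.
response = index.query("your question", similarity_top_k=3)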

Maybe try this first

Plain Text
from langchain.prompts.chat import (
    AIMessagePromptTemplate,
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
)

from llama_index.prompts.prompts import RefinePrompt

# Refine Prompt
CHAT_REFINE_PROMPT_TMPL_MSGS = [
    HumanMessagePromptTemplate.from_template("{query_str}"),
    AIMessagePromptTemplate.from_template("{existing_answer}"),
    HumanMessagePromptTemplate.from_template(
        "I have more context below which can be used "
        "(only if needed) to update your previous answer.\n"
        "------------\n"
        "{context_msg}\n"
        "------------\n"
        "Given the new context, update the previous answer to better "
        "answer my previous query."
        "If the previous answer remains the same, repeat it verbatim. "
        "Never reference the new context or my previous query directly.",
    ),
]


CHAT_REFINE_PROMPT_LC = ChatPromptTemplate.from_messages(CHAT_REFINE_PROMPT_TMPL_MSGS)
CHAT_REFINE_PROMPT = RefinePrompt.from_langchain_prompt(CHAT_REFINE_PROMPT_LC)
...
query_configs = [
  {
     "index_struct_type": "simple_dict",
     "query_mode": "default",
     "query_kwargs": {
           "similarity_top_k": 3,
           "refine_template": CHAT_REFINE_PROMPT 
     },
  },
  {
     "index_struct_type": "list",
     "query_mode": "default",
     "query_kwargs": {
           "refine_template": CHAT_REFINE_PROMPT 
     },
  }
]
Just guessing at what your query configs look like, but basically you want to pass the refine template into the config of both the list and vector indexes
I've been testing this prompt and it worked well on my test data
(I'm assuming you are using gpt 3.5 haha)
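And then the configs get passed in when querying the graph, something like this (same old-style llama_index API as above):
Plain Text
response = graph.query(
    "your question",
    query_configs=query_configs,  # applies CHAT_REFINE_PROMPT to both sub-queries
)
print(response)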