High level question on structuring nodes

High level question on structuring nodes/docs. So far, I broke down one 10-K document and set each section as a node along with metadata for that section. But that is for one year and one company. How should I think about the structure if I want to have multiple years and documents? I would like to start simple, but maybe this requires composability.
Yea I would make an index for every year/company

Then, use something like a sub-question query engine to appropriately route/answer queries
There seem to be various approaches to tackling multiple documents (composable graphs, router query engine, decompose transform, sub-questions). What's a good way to think about which to choose? And what's the difference between using DecomposeQueryTransform vs SubQuestionQueryEngine?
Tbh, I would use either sub-questions or a router.

A router is good for sending the exact user message to one or more indexes. There's also a newer router retriever, which tbh is also neat.

A sub question query engine is good for breaking down a query into multiple different ones, and sending each sub-question to a specific index

For your use case, I would recommend the sub-question query engine. Then, you can compare/contrast stats from different years and companies pretty easily
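That setup can be sketched roughly like this, assuming the 0.8-era llama_index API (matching the import paths in the traceback later in this thread); the directory layout, tool names, and descriptions are illustrative assumptions, and an LLM API key would need to be configured:

```python
# Hedged sketch: one vector index per (company, year) filing, each wrapped
# as a tool so a SubQuestionQueryEngine can route sub-questions to it.
# Paths, names, and descriptions below are illustrative, not prescriptive.
from llama_index import SimpleDirectoryReader, VectorStoreIndex
from llama_index.query_engine import SubQuestionQueryEngine
from llama_index.tools import QueryEngineTool, ToolMetadata

tools = []
for name in ["Penumbra_2022_10-K", "Inari_Medical_2022_10-K"]:
    # Each filing lives in its own directory and becomes its own index
    docs = SimpleDirectoryReader(f"./filings/{name}").load_data()
    index = VectorStoreIndex.from_documents(docs)
    tools.append(
        QueryEngineTool(
            query_engine=index.as_query_engine(),
            metadata=ToolMetadata(
                name=name,
                description=f"Sections of the {name.replace('_', ' ')} filing",
            ),
        )
    )

# The engine decomposes a query into per-tool sub-questions, answers each
# against its index, then synthesizes a final response
engine = SubQuestionQueryEngine.from_defaults(query_engine_tools=tools)
response = engine.query(
    "Compare Penumbra's and Inari Medical's 2022 product lines."
)
```

With one tool per year/company, compare/contrast queries naturally decompose into one sub-question per filing.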
What’s your anecdotal experience with using sub-question queries with OSS models? The tutorial clearly shows that OpenAI can do a pretty good job, but I’m not sure if the smaller open-source ones can match
ha ok good question. Open-source is not great, because the sub-question engine relies on outputting structured JSON, and open-source models kinda suck at doing that consistently

One option is using openai for JUST the sub-question generation, if you are worried about data privacy

IMO using open-source models right now really limits the interesting things you can do with llama-index. Unless the open-source model is like 70B parameters lol
Fingers crossed someone is fine tuning an instruct model just for RAG and outputting json!

We are definitely worried about data privacy. How does using OpenAI just on sub-question generation work without any leakage? I would have to ask my question in a way that doesn’t give anything away
Tbh I think the data privacy issue is a tad overblown? But thats just me 😅

The sub question generation is not actually reading the data in your index -- it's purely looking at the user query, and the descriptions of indexes. Fairly minimized impact, but maybe that's still too much
Yeah I don’t disagree with you lol. We have a team of compliance lawyers that are super strict about this. If I were to use OpenAI for just the sub question part, I just have to make sure my question doesn’t give any sensitive information away.
Basically don’t ask it anything I wouldn’t ask Google search I guess
Hey, I have a question and a few follow-ups on this. After experimenting with different llama2 models for sub-question generation, your hunch that only 70B can return valid JSON was right on. Though I did run into a KeyError, even though the tool exists (traceback below):

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-e857b28a-5e17-49f0-a172-d99ed177bdfd/lib/python3.10/site-packages/llama_index/query_engine/sub_question_query_engine.py:143, in <listcomp>(.0)
140 qa_pairs_all = cast(List[Optional[SubQuestionAnswerPair]], qa_pairs_all)
141 else:
142 qa_pairs_all = [
--> 143 self._query_subq(sub_q, color=colors[str(ind)])
144 for ind, sub_q in enumerate(sub_questions)
145 ]
147 # filter out sub questions that failed
148 qa_pairs: List[SubQuestionAnswerPair] = list(filter(None, qa_pairs_all))

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-e857b28a-5e17-49f0-a172-d99ed177bdfd/lib/python3.10/site-packages/llama_index/query_engine/sub_question_query_engine.py:242, in SubQuestionQueryEngine._query_subq(self, sub_q, color)
237 with self.callback_manager.event(
238 CBEventType.SUB_QUESTION,
239 payload={EventPayload.SUB_QUESTION: SubQuestionAnswerPair(sub_q=sub_q)},
240 ) as event:
241 question = sub_q.sub_question
--> 242 query_engine = self._query_engines[sub_q.tool_name]
244 if self._verbose:
245 print_text(f"[{sub_q.tool_name}] Q: {question}\n", color=color)
Should I just not use callback manager?
I haven’t yet tried changing the DEFAULT_SUB_QUESTION_PROMPT_TMPL from prompts.py, but some people are suggesting using the following llama2 prompt format for json:

https://reddit.com/r/LocalLLaMA/s/nt9bFSbuYf

instruction = B_INST + " Respond to the following in JSON with 'action' and 'action_input' values " + E_INST
human_msg = instruction + "\nUser: {input}"

agent.agent.llm_chain.prompt.messages[2].prompt.template = human_msg

https://www.pinecone.io/learn/llama-2/
When I directly fed llama2 70B DEFAULT_SUB_QUESTION_PROMPT_TMPL from prompts.py, it gave me this:

[
    {
        "sub_question": "What are the products offered by Penumbra in 2022",
        "tool_name": "Penumbra_2022_10-K"
    },
    {
        "sub_question": "What are the products offered by Inari Medical in 2022",
        "tool_name": "Inari Medical_2022_10-K"
    },
    {
        "sub_question": "How have Penumbra's products changed over time",
        "tool_name": "Penumbra_2021_10-K, Penumbra_2020_10-K"
    },
    {
        "sub_question": "How have Inari Medical's products changed over time",
        "tool_name": "Inari Medical_2021_10-K, Inari Medical_2020_10-K"
    }
]
Oh I get the error now: it's giving two tools as one string under tool_name
here is GPT 3.5's response:

[
    {
        "sub_question": "What are Penumbra's products?",
        "tool_name": "Penumbra_2022_10-K"
    },
    {
        "sub_question": "What are Inari Medical's products?",
        "tool_name": "Inari Medical_2022_10-K"
    }
]
Nice debugging! Yea you got it
I suppose the prompt could maybe make it clear only one tool name is needed?
good idea. that sub_question should've been broken into two sub_questions
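That failure mode is also easy to guard against before any querying happens. A hypothetical pre-flight check (the function name and shape are my own, not llama_index's) that rejects a comma-joined tool_name like the llama2 output above:

```python
import json

# Hypothetical guard: validate an LLM's sub-question JSON before routing.
# Catches the failure seen above, where llama2 packed two tool names into
# one comma-separated string that no single query engine matches.
def validate_sub_questions(raw_json, valid_tools):
    items = json.loads(raw_json)
    checked = []
    for item in items:
        tool = item["tool_name"]
        if tool not in valid_tools:
            # A comma-joined value like "A, B" fails this membership test,
            # which is exactly what triggered the KeyError deeper in the engine
            raise ValueError(f"unknown or multi-valued tool_name: {tool!r}")
        checked.append((item["sub_question"], tool))
    return checked

tools = {"Penumbra_2022_10-K", "Inari Medical_2022_10-K"}
good = '[{"sub_question": "What are Penumbra\'s products?", "tool_name": "Penumbra_2022_10-K"}]'
pairs = validate_sub_questions(good, tools)
```

Failing fast here gives a chance to re-prompt the model instead of crashing mid-query.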
for question_gen, does the guidance-based one only support OpenAI models?
Hmm, it's using the guidance package. Looks like they added support for other LLMs though
https://github.com/guidance-ai/guidance
guidance.llms.Transformers("your_path/llama-7b", device=0)
ok cool, that looks pretty robust coming from microsoft
for the compare/contrast sub-questions, I see the breakdown by individual company, but where's the last sub-question that actually compares the two companies?
is that left to the lower-level LLM to synthesize?
So once it asks all the sub-questions, then it uses your original query and all the responses to sub questions to do the final analysis
oh I see, it just crams all those answers along with the original query to that LLM
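That "cramming" step can be pictured with a toy sketch; the template text below is illustrative, not llama_index's actual synthesis prompt:

```python
# Rough sketch of the final synthesis step described above: the original
# query plus every sub-question answer get packed into one prompt for the
# LLM to synthesize. The wording is an assumption, not the library's template.
def build_synthesis_prompt(original_query, qa_pairs):
    context = "\n\n".join(
        f"Sub question: {q}\nResponse: {a}" for q, a in qa_pairs
    )
    return (
        "Given the following sub-question answers:\n\n"
        f"{context}\n\n"
        f"Answer the original question: {original_query}"
    )

prompt = build_synthesis_prompt(
    "Compare Penumbra's and Inari Medical's products.",
    [("What are Penumbra's products?", "Thrombectomy devices."),
     ("What are Inari Medical's products?", "Venous clot removal devices.")],
)
```

So the comparison itself is indeed left to the final LLM call, which sees all the sub-answers at once.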
thanks so much for your help again. I'll let you know if I have any luck lol
Sounds good! 🙏
silly question but how does the coloring work?
lol I have no idea actually, I think we used a util function from langchain for that
GPT-3.5 is definitely better at breaking down questions than the 70B llama2 model, so I may just do that for now. But it seems like sometimes the final synthesizing model ignores all the sub-question answers when giving a final answer.
It gave a bunch of revenue-related sub-answers, but the final answer has nothing to do with that lol
but there are also times when it was able to give a logical answer based on the sub-answers
Hahaha I saw that post today
I thought fat chance 😅
Thanks again for being so patient in answering my questions. I have two more:

  1. Are there use cases where a graph makes more sense than a sub-question query? I feel like sub-questions can handle everything a graph does; it really comes down to choosing the type of index and extracting good metadata for each doc.
  2. Maybe this has to do with node post-processing. Is there a dynamic way to set similarity_top_k so I always use the maximum that can fit inside the context window?
By graph, I mean like decompose graphs
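For question 2, there doesn't appear to be a built-in flag, so here's a workaround sketch: budget tokens against the model's context window and derive similarity_top_k from the chunk size. All numbers are illustrative assumptions:

```python
# Workaround sketch (no built-in llama_index option assumed): derive
# similarity_top_k from the model's context window, the prompt overhead,
# and the configured chunk size, all measured in tokens.
def dynamic_top_k(context_window, prompt_tokens, answer_budget, chunk_tokens):
    usable = context_window - prompt_tokens - answer_budget
    return max(1, usable // chunk_tokens)  # always retrieve at least one chunk

# e.g. a 4096-token model, ~500-token prompt, 512 tokens reserved for the
# answer, 1024-token chunks -> 3 chunks fit
k = dynamic_top_k(4096, 500, 512, 1024)
```

The resulting `k` could then be passed as `similarity_top_k` when building each query engine; a tokenizer-based count of the actual prompt would be more precise than a fixed estimate.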
@captam_morgan I'm currently facing the same situation as your original message, so this post has been incredibly helpful for clarifying my ideas and experiments. Not sure if it will help you much, but I noticed in my experiments that ComposableGraph doesn't play nicely with DecomposeQueryTransform; you can see my issues in a question I asked a few days ago here
Thanks for this. Logan said graphs are being deprecated in favor of sub-questions.