Gonna try asking here because the bot

MikeyC - dribgib

Gonna try asking here because the bot didn't understand me. What if I have a folder about a lake, and in that folder are separate text files about the lake itself and its history. It also contains folders for the 5 towns the lake encompasses. In each town folder are text files with information about the communities in that town. What kind of index would I use to make sure LlamaIndex understands the hierarchy? I was told I could use the DirectoryReader class and add Field_Names that would be embedded with the file to give it context. Just not sure which type of index to use.
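To make the "metadata embedded with the file" idea concrete, here is a minimal sketch in plain Python. It is not the DirectoryReader API; `load_with_hierarchy`, `with_context`, and the metadata keys are hypothetical names for illustration. It assumes files directly under the root describe the lake and files inside a subfolder belong to the town named by that subfolder.

```python
from pathlib import Path

def load_with_hierarchy(root):
    """Collect every .txt file under root, with metadata saying where it sits."""
    root = Path(root)
    docs = []
    for path in sorted(root.rglob("*.txt")):
        parts = path.relative_to(root).parts
        # Assumption: top-level files are about the lake itself; files in a
        # subfolder belong to the town named by the first path component.
        town = parts[0] if len(parts) > 1 else None
        docs.append({
            "text": path.read_text(encoding="utf-8"),
            "metadata": {"lake": root.name, "town": town, "file_name": path.name},
        })
    return docs

def with_context(doc):
    """Prefix the text with its metadata so the hierarchy travels with the text."""
    header = ", ".join(
        f"{key}: {value}"
        for key, value in doc["metadata"].items()
        if value is not None
    )
    return f"[{header}]\n{doc['text']}"
```

The string returned by `with_context` is what you would pass on to whatever embedding or indexing step you use, so each chunk carries its lake and town along with it.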

11 comments

jerryjliu0

yep! the index kind of depends what questions you want to ask. do you want to ask questions about specific facts? then use a vector index. do you want to perform summarization? then try using a list index
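A toy sketch of the difference, assuming a bag-of-words vector as a stand-in for a real embedding model (none of these function names come from the library): vector-style retrieval keeps only the chunks closest to the question, while list-style retrieval passes every chunk along, which is why it suits summarization.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def vector_retrieve(chunks, query, top_k=1):
    """Vector-index style: keep only the top_k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]

def list_retrieve(chunks, query):
    """List-index style: return every chunk, in order, regardless of the query."""
    return list(chunks)
```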

bSharpCyclist

what if you wanted to support both? I'm playing with a collection of documents as separate vector indexes, and then wrapping that all together in a list index.

Logan M

You could do some sort of pre-processing on the query using embeddings to detect if the query is asking for a summary or not lol

Another (slower) option is straight-up manually asking the LLM too

once you know whether the user wants a summary or a normal query, you can call the appropriate indexes/functions
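A rough sketch of that pre-processing step, under loud assumptions: word overlap stands in for embedding similarity, the example queries are made up, and `detect_intent` / `route` are hypothetical helpers rather than library calls. The query is compared against a few examples of each intent and sent to whichever handler matches best.

```python
import re

# Made-up example queries for each intent; a real setup would embed these.
SUMMARY_EXAMPLES = ["summarize this document", "give me an overview", "what is this about"]
FACT_EXAMPLES = ["when was the town founded", "how deep is the lake", "who lives there"]

def words(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def overlap(a, b):
    """Jaccard word overlap; a crude stand-in for embedding similarity."""
    wa, wb = words(a), words(b)
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0

def detect_intent(query, similarity=overlap):
    """Return "summary" or "fact" depending on which examples the query resembles."""
    best_summary = max(similarity(query, ex) for ex in SUMMARY_EXAMPLES)
    best_fact = max(similarity(query, ex) for ex in FACT_EXAMPLES)
    # Ties fall back to "fact", i.e. a normal query.
    return "summary" if best_summary > best_fact else "fact"

def route(query, handlers):
    """Call the handler registered for the detected intent."""
    return handlers[detect_intent(query)](query)
```

In practice the two handlers would wrap the query calls on a list-style index and a vector-style index; swapping `overlap` for a real embedding comparison only changes the `similarity` argument.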

bSharpCyclist

well that's where I was going. Do we first try to understand intent, and then try to route to the correct index to use?

Logan M

That's my thinking 🤔

Actually, if you put a vector/keyword index at the top level with two sub-indexes (one for summaries, like a list index, and the other for normal queries, like an existing composable index), and give each one a summary like "This index will provide a summary", it might route the query properly. That starts to get a little complex though

bSharpCyclist

I wish I had a better intuition on how to combine indexes together, to support multiple use cases like summarize and ask a specific question. Take a book as an example, with 5 chapters. I could see one wanting to support an answer to a specific question, and perhaps a summary. What index design structure best supports that?

Logan M

If you need to handle both queries, I think things will get a little complicated

Here's a rough idea off the top of my head lol. I think the response time might be a little slow though, not sure. Probably depends on the number of chapters

Attachment

bSharpCyclist

Thank you for that! I appreciate you taking the time to draw that up. Lots to play with. For some things, just a simple vector index on the entire book is enough! I also saw a use case where they make each document a tree index. When is it best to use that?

Logan M

Always easier to communicate with an image haha

Trees are a good replacement for list indexes imo. They provide good summarization abilities.

I've also seen some examples with them at the top level of a graph but I haven't played around with it too much yet
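For intuition on why trees summarize well, here is a toy sketch of the bottom-up idea: leaf chunks are grouped, each group is summarized into a parent, and this repeats until one root is left. `build_tree` and `toy_summarize` are hypothetical names, and `toy_summarize` is only a placeholder for an LLM summarization call.

```python
def build_tree(chunks, summarize, num_children=2):
    """Build parent levels bottom-up until a single root summary remains."""
    if not chunks:
        raise ValueError("need at least one chunk")
    if num_children < 2:
        raise ValueError("num_children must be at least 2")
    levels = [list(chunks)]
    while len(levels[-1]) > 1:
        current = levels[-1]
        # Each parent summarizes up to num_children consecutive nodes.
        parents = [
            summarize(current[i:i + num_children])
            for i in range(0, len(current), num_children)
        ]
        levels.append(parents)
    return levels  # levels[0] holds the leaves, levels[-1] holds the root

def toy_summarize(texts):
    """Placeholder for an LLM call: keep the first three words of each child."""
    return " / ".join(" ".join(t.split()[:3]) for t in texts)
```

With five chapters and `num_children=2`, the levels shrink from 5 nodes to 3, then 2, then a single root, so a summary question only needs the top of the tree instead of every chunk.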

bSharpCyclist

I was actually playing with a tree at the top level as well, ok results. But testing is stalled .. OpenAI not behaving

bSharpCyclist

I guess that's where the magic happens .. how you index the data .. and like any index you create, it should support a specific use case
