
Updated 3 months ago

Gonna try asking here because the bot

Gonna try asking here because the bot didn't understand me. What if I have a folder about a lake, and in that folder are separate text files about the lake itself and its history? It also contains folders for the 5 towns the lake encompasses. In each town folder are text files with information about the communities in that town. What kind of index would I use to make sure LlamaIndex understands the hierarchy? I was told I could use the DirectoryReader class and add Field_Names that would be embedded with the file to give it context. Just not sure which type of index to use.
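Whichever index ends up being used, one way to make the hierarchy explicit is to derive metadata from each file's path and carry it along with the text. Here is a minimal sketch in plain Python, not the library's own reader API: the function name `collect_documents`, the metadata keys `lake`, `town`, and `file`, and the assumption that town folders sit directly under the lake folder are all hypothetical.

```python
from pathlib import Path


def collect_documents(lake_dir):
    """Walk a lake folder and return one record per .txt file.

    Assumed layout (hypothetical):
        <lake>/<file>.txt          -> about the lake itself
        <lake>/<town>/<file>.txt   -> about a community in that town
    """
    root = Path(lake_dir)
    documents = []
    for path in sorted(root.rglob("*.txt")):
        parts = path.relative_to(root).parts
        # Files directly under the lake folder have no town.
        town = parts[0] if len(parts) > 1 else None
        metadata = {"lake": root.name, "town": town, "file": path.stem}
        # Prefix the text with its place in the hierarchy so that
        # context travels with the text when it is embedded.
        context = " / ".join(v for v in (root.name, town, path.stem) if v)
        documents.append({
            "text": f"[{context}]\n{path.read_text(encoding='utf-8')}",
            "metadata": metadata,
        })
    return documents
```

Each record could then be handed to whatever index is chosen, with the `metadata` dict used for filtering by town.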
11 comments
yep! the index kind of depends what questions you want to ask. do you want to ask questions about specific facts? then use a vector index. do you want to perform summarization? then try using a list index
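To make the difference concrete, here is a toy sketch in plain Python (not the real library classes): a "vector" style lookup that returns only the top-k most similar chunks, next to a "list" style pass that hands back every chunk in order. The word-count cosine similarity is just a stand-in for real embeddings, and `vector_retrieve` and `list_retrieve` are made-up names.

```python
import math
import re
from collections import Counter


def embed(text):
    # Stand-in for a real embedding: a bag-of-words count vector.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def vector_retrieve(chunks, query, top_k=1):
    # Fact lookup: keep only the chunks most similar to the query.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(embed(c), q), reverse=True)
    return ranked[:top_k]


def list_retrieve(chunks, query):
    # Summarization: the query does not filter anything,
    # every chunk is passed along in its original order.
    return list(chunks)
```

The point of the contrast: a specific question only needs the few chunks that mention the fact, while a summary has to see everything.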
what if you wanted to support both? I'm playing with a collection of documents as separate vector indexes, and then wrapping that all together in a list index.
You could do some sort of pre-processing on the query using embeddings to detect if the query is asking for a summary or not lol

Another (slower) option is straight-up manually asking the LLM too

once you know if the user wants a summary or a normal query, you can call the appropriate indexes/functions
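A rough sketch of that flow in plain Python: score the query against a few example queries per intent, then call whichever handler matches. The bag-of-words similarity is only a stand-in for real embeddings, and the example queries, `detect_intent`, `route`, and the handlers are all made up for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical example queries for each intent.
EXAMPLES = {
    "summary": ["summarize this document", "give me an overview of the chapter"],
    "fact": ["what year did this happen", "who founded the town"],
}


def embed(text):
    # Stand-in for a real embedding: a bag-of-words count vector.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def detect_intent(query):
    # Score each intent by its best-matching example query.
    q = embed(query)
    scores = {
        intent: max(cosine(q, embed(example)) for example in examples)
        for intent, examples in EXAMPLES.items()
    }
    # On a tie, the intent listed first in EXAMPLES wins.
    return max(scores, key=scores.get)


def route(query, handlers):
    # handlers maps an intent name to a function that takes the query,
    # e.g. one that queries a list index and one that queries a vector index.
    return handlers[detect_intent(query)](query)
```

With real embeddings the example lists can stay this short; with word counts like these, the routing is only as good as the word overlap.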
well that's where I was going. Do we first try to understand intent, and then try to route to the correct index to use?
That's my thinking 🤔

Actually, if you put a vector/keyword index at the top level with two sub-indexes (one for summaries, like a list index, and the other for normal queries, like an existing composable index), and then give the sub-indexes summaries like "This index will provide a summary", it might route it properly. That starts to get a little complex though
I wish I had a better intuition on how to combine indexes together, to support multiple use cases like summarize and ask a specific question. Take a book as an example, with 5 chapters. I could see one wanting to support an answer to a specific question, and perhaps a summary. What index design structure best supports that?
If you need to handle both queries, I think things will get a little complicated

Here's a rough idea off the top of my head lol. I think the response time might be a little slow though, not sure. Probably depends on the number of chapters
[Attachment: image.png]
Thank you for that! I appreciate you taking the time to draw that up. Lots to play with. For some things, just a simple vector index on the entire book is enough! I also saw a use case where they make each document a tree index. When is it best to use that?
Always easier to communicate with an image haha

Trees are a good replacement for list indexes imo. They provide good summarization abilities.

I've also seen some examples with them at the top level of a graph but I haven't played around with it too much yet
I was actually playing with a tree at the top level as well, ok results. But testing is stalled .. OpenAI not behaving
I guess that's where the magic happens .. how you index the data .. and like any index you create, it should support a specific use case