Updated 3 months ago

Hey @Logan M

Hey

I was following your Bottoms-Up tutorials on YouTube and I have a couple of questions. I am doing a POC using a set of docs files written in markdown. I have close to 3300 files in a nested folder structure. My questions are:

  1. At the top level I have 10 folders that broadly classify the category each document belongs to, and inside each of these 10 folders is a nested folder structure that contains the markdown files. In terms of providing better context, does it make sense to delete the nested folder structure inside the 10 folders and keep all the markdown files in a flat layout?
  2. You tested the accuracy of your query engine using GPT-4. Is there any other way I can test it, since I don't have access to GPT-4 right now?
  3. Are there any best practices you recommend for better response accuracy?
Hey!

  1. You can keep the files in subdirectories, but if you are using SimpleDirectoryReader you will have to set recursive to True. It will then recursively check all the subdirectories.
But I think in the end it puts all the files together before creating Nodes.

  2. There are lots of open-source LLMs that have been used with LlamaIndex to validate/verify datasets. You can find the list here: https://docs.llamaindex.ai/en/stable/module_guides/models/llms.html#open-source-llms
  3. Yes, there are many good resources: https://discord.com/channels/1059199217496772688/1187700090178109490/1187701427624214588
Take a look here!
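
For the first point, here is a rough sketch of loading the nested markdown files without flattening them. It assumes a hypothetical root folder named `docs` that holds the 10 category folders; `category_metadata` and `load_markdown_documents` are illustrative names, not part of LlamaIndex. The import path for `SimpleDirectoryReader` depends on the installed version, so both are tried. Since the files end up together before Nodes are created, tagging each document with its top-level folder is one possible way to keep the category information around.

```python
from pathlib import Path

# Hypothetical root folder containing the 10 top-level category folders.
DOCS_ROOT = Path("docs")


def category_metadata(file_path: str, root: Path = DOCS_ROOT) -> dict:
    """Record which top-level category folder a file lives under."""
    rel = Path(file_path).resolve().relative_to(root.resolve())
    # Files sitting directly in the root have no category folder.
    category = rel.parts[0] if len(rel.parts) > 1 else ""
    return {"category": category, "relative_path": rel.as_posix()}


def load_markdown_documents(root: Path = DOCS_ROOT):
    """Load every .md file under root, walking all subdirectories."""
    try:
        from llama_index.core import SimpleDirectoryReader  # newer releases
    except ImportError:
        from llama_index import SimpleDirectoryReader  # older releases

    reader = SimpleDirectoryReader(
        input_dir=str(root),
        recursive=True,          # descend into the nested folders
        required_exts=[".md"],   # only pick up markdown files
        file_metadata=lambda path: category_metadata(path, root),
    )
    return reader.load_data()
```

Calling `load_markdown_documents()` should return one list of documents covering all nested folders, with the `category` value available in each document's metadata.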