Updated 2 years ago

I am trying to make a QA model over news

I am trying to make a QA model over news articles with metadata. Right now the best option looks like using composable indices: build an index on each article, then tie those indices together.

What data format is best for loading them into a model? I currently have them as a CSV. Is it better to keep them as CSV, convert them to JSON or a knowledge graph, let the model scrape them from the URL on its own, or is there a better option not listed?
7 comments
There is a CSVParser within llama_index. I'm not sure how your csv is structured, but you can likely use that to load the data
By default it treats each row as a document
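To make the "one row, one document" idea concrete, here is a minimal sketch using only the Python standard library. It is not llama_index's CSVParser, just an illustration of the concept; the function name `rows_to_documents`, the column names (`title`, `date`, `body`), and the sample rows are all made up for the example.

```python
import csv
import io


def rows_to_documents(csv_text, text_column, metadata_columns):
    """Turn each CSV row into one record with a text body and metadata.

    Hypothetical helper for illustration; not part of llama_index.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    documents = []
    for row in reader:
        documents.append(
            {
                "text": row[text_column],
                # Keep the article metadata next to the text it describes.
                "metadata": {col: row[col] for col in metadata_columns},
            }
        )
    return documents


# Made-up sample data standing in for a news CSV.
sample = (
    "title,date,body\n"
    "Rates rise,2023-03-01,The central bank raised rates.\n"
    "Storm hits,2023-03-02,A storm hit the coast.\n"
)
docs = rows_to_documents(sample, "body", ["title", "date"])
```

Each entry in `docs` then holds one article's text plus its metadata, which is roughly the shape you would want before building an index per article.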
I will check that tomorrow, thanks.
CSVParser is not searchable and I can't find it in the docs. Do you have a handy link to the correct page?
Looks like the SimpleDirectoryReader uses the appropriate parser automatically https://github.com/jerryjliu/gpt_index/blob/aa0092a14420a75e2e4251a2bd8aa1d5f9c28e29/gpt_index/readers/file/base.py

So just point SimpleDirectoryReader to your folder of data and it will work its magic
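For anyone curious what "uses the appropriate parser automatically" can look like, here is a rough standard-library sketch of picking a parser by file extension. This is an assumption about the general pattern, not the actual SimpleDirectoryReader code; `PARSERS`, `load_directory`, and the parser functions are hypothetical names.

```python
import csv
import json
from pathlib import Path


def parse_csv(path):
    # One text chunk per CSV row (header row included).
    with path.open(newline="", encoding="utf-8") as f:
        return [", ".join(row) for row in csv.reader(f)]


def parse_json(path):
    # Re-serialize the parsed JSON as a single text chunk.
    return [json.dumps(json.loads(path.read_text(encoding="utf-8")))]


def parse_text(path):
    # Fallback: treat the whole file as one text chunk.
    return [path.read_text(encoding="utf-8")]


# Hypothetical extension-to-parser table; anything else falls back to plain text.
PARSERS = {".csv": parse_csv, ".json": parse_json}


def load_directory(directory):
    """Read every file in a folder, choosing a parser from its extension."""
    texts = []
    for path in sorted(Path(directory).iterdir()):
        if path.is_file():
            parser = PARSERS.get(path.suffix.lower(), parse_text)
            texts.extend(parser(path))
    return texts
```

Calling `load_directory("data")` on a folder of mixed files would return a flat list of text chunks, with the CSV files split per row and everything else read whole.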
Ok, that was so user friendly it threw me. TY, your team is awesome and this is the best help I have ever gotten from a tech Discord.
Thanks @Kren, happy to help! The more people using llama index, the better it will become 💪