
Updated 2 years ago

I am trying to make a QA model over news


I am trying to make a QA model over news articles with metadata. Right now the best option looks like building a composable index on each article and then tying those indices together.

What data format is best for loading them into a model? I currently have them as a CSV. Is it better to keep them as CSV, convert them to JSON or a knowledge graph, or let the model scrape them from the URLs on its own? Or is there a better option not listed?

7 comments

Logan M

There is a CSVParser within llama_index. I'm not sure how your CSV is structured, but you can likely use that to load the data.

Logan M

By default it treats each row as a document
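To make the row-per-document idea concrete, here is a minimal sketch using only the Python standard library. It is not llama_index's CSVParser, and the library's actual default behaviour may differ between versions. The sample data, the column names, and the `rows_to_documents` helper are all hypothetical.

```python
import csv
import io

# Hypothetical sample data; the column names are assumptions for illustration.
SAMPLE_CSV = """title,source,published,body
Rates hold steady,Example Wire,2023-03-01,The central bank left rates unchanged.
New stadium approved,City Desk,2023-03-02,The council voted to approve the plan.
"""


def rows_to_documents(csv_text, text_column="body"):
    """Turn each CSV row into one document: the text column plus a metadata dict."""
    documents = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Pull the article text out; whatever columns remain become metadata.
        text = row.pop(text_column)
        documents.append({"text": text, "metadata": dict(row)})
    return documents


docs = rows_to_documents(SAMPLE_CSV)
for doc in docs:
    print(doc["metadata"]["title"], "->", doc["text"])
```

Keeping the metadata columns separate from the article text means they can later be attached to each document rather than mixed into the text that gets embedded.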

Kren

I will check that tomorrow, thanks.

Kren

CSVParser is not searchable and I can't find it in the docs. Do you have a handy link to the correct page?

Logan M

Looks like the SimpleDirectoryReader uses the appropriate parser automatically https://github.com/jerryjliu/gpt_index/blob/aa0092a14420a75e2e4251a2bd8aa1d5f9c28e29/gpt_index/readers/file/base.py

So just point SimpleDirectoryReader to your folder of data and it will work its magic.
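A rough sketch of what that looks like in code. The import path is an assumption that depends on the installed version (newer releases expose the reader under `llama_index.core`, older ones under `llama_index`), and the `write_sample_csv` and `load_folder` helpers, the file name, and the sample row are hypothetical.

```python
import csv
from pathlib import Path


def write_sample_csv(folder: Path) -> Path:
    """Write a tiny, made-up articles CSV so there is something to load."""
    path = folder / "articles.csv"
    with path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["title", "source", "body"])
        writer.writerow(["Example headline", "Example Wire", "Example body text."])
    return path


def load_folder(folder: Path):
    """Load every supported file in `folder` via SimpleDirectoryReader."""
    # Imported lazily so the rest of this sketch runs without llama_index installed.
    try:
        from llama_index.core import SimpleDirectoryReader  # assumed path, newer releases
    except ImportError:
        from llama_index import SimpleDirectoryReader  # assumed path, older releases
    # The reader picks a parser per file based on its extension.
    return SimpleDirectoryReader(input_dir=str(folder)).load_data()
```

Calling `load_folder(Path("data"))` on a folder that contains the CSV should return a list of documents ready to be passed to an index. How many documents a single CSV produces depends on the parser and version in use.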

Kren

Ok, that was so user friendly it threw me. TY, your team is awesome and this is the best help I have ever gotten from a tech Discord.

Logan M

Thanks @Kren happy to help! The more people using llama index, the better it will become 💪
