Updated 2 months ago

2) When using a LlamaIndex doc loader I

2) When using a LlamaIndex doc loader, I want to see which files loaded and which did not. What logging settings are best for this? For example, I used an .md loader on a directory of sub-directories and it loaded 80% of the files. I need to know what was skipped and why.
6 comments
It will always load everything and print anything that was skipped. You can check the document.metadata of each loaded document object for more details.
What if each file has the same name in different sub-directories? I found 120 files but only 80 were parsed, and I saw no skipped ones listed.
The metadata has the full path:
>>> from llama_index.core import SimpleDirectoryReader
>>> documents = SimpleDirectoryReader("./docs/docs/examples/data/paul_graham").load_data()
>>> documents[0].metadata
{'file_path': '/Users/loganmarkewich/giant_change/llama_index/docs/docs/examples/data/paul_graham/paul_graham_essay.txt', 'file_name': 'paul_graham_essay.txt', 'file_type': 'text/plain', 'file_size': 75042, 'creation_date': '2024-04-16', 'last_modified_date': '2024-04-16'}
I was using the .md loader from the hub because SimpleDirectoryReader saw even fewer files. I will try again.
Try passing recursive=True to SimpleDirectoryReader.
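One way to see exactly which files did not make it in, without relying on log output, is to compare the files on disk with the file_path values in the loaded documents' metadata. The following is a minimal sketch, not an official LlamaIndex utility: find_unloaded and report_unloaded are hypothetical helper names, and the "*.md" pattern is an assumption based on the question. It compares paths rather than counts, since some readers may return more than one document per file.

```python
from pathlib import Path


def find_unloaded(root, loaded_paths, pattern="*.md"):
    """Return files under root matching pattern that are absent from loaded_paths."""
    # rglob walks sub-directories, so same-named files in different
    # folders remain distinct because full paths are compared.
    on_disk = {p.resolve() for p in Path(root).rglob(pattern) if p.is_file()}
    loaded = {Path(p).resolve() for p in loaded_paths}
    return sorted(on_disk - loaded)


def report_unloaded(root, pattern="*.md"):
    """Load root with SimpleDirectoryReader and list the files it did not return."""
    # Imported here so find_unloaded can be used without llama_index installed.
    from llama_index.core import SimpleDirectoryReader

    documents = SimpleDirectoryReader(root, recursive=True).load_data()
    # file_path is the metadata key shown in the session above.
    loaded_paths = [doc.metadata["file_path"] for doc in documents]
    return find_unloaded(root, loaded_paths, pattern)
```

Calling report_unloaded("./docs") would return the .md paths that exist on disk but are missing from the loaded documents. This shows which files were skipped, though not why; each listed file would still need to be inspected or loaded individually to find the cause.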