**MarkdownReader broken?**

At a glance

A community member reported that MarkdownReader stopped working when they passed their own file_extractor mapping to SimpleDirectoryReader. The error message said that MarkdownReader.load_data() was missing a required positional argument, 'file'. Another community member pointed out that the readers have to be instantiated, for example ".md": MarkdownReader() rather than ".md": MarkdownReader. The original poster confirmed that this solved the problem.

jjoey

Solved (thanks!):
You have to instantiate the readers.

Correct: ".md": MarkdownReader(),

Incorrect: ".md": MarkdownReader,

When I try to use my own set of file_extractors, I get the following error:

Failed to load file /app/data/manual.md with error: MarkdownReader.load_data() missing 1 required positional argument: 'file'. Skipping...

Code:

file_extractor = {
    ".csv": PandasCSVReader,
    ".docx": DocxReader,
    ...
}
SimpleDirectoryReader(
    input_dir=self.knowledge_path,
    file_extractor=file_extractor,
).load_data()

But this goes away if I just use default extractors. Any ideas?

5 comments

jjoey

poetry show llama-index-readers-file
 name         : llama-index-readers-file
 version      : 0.1.33

poetry show llama-index
 name         : llama-index
 version      : 0.10.65

Logan M

I'm pretty sure you are meant to be instantiating the readers

Logan M

".md": MarkdownReader(), for example
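
To see why passing the class instead of an instance produces this particular error, here is a minimal sketch. StandInReader and load_with are hypothetical names used only for illustration; the stand-in mimics the load_data(self, file) shape of a reader, and the way the loader calls it is an assumption, not the actual llama-index code.

```python
class StandInReader:
    """Hypothetical stand-in for a file reader; not the real MarkdownReader."""

    def load_data(self, file):
        # Return a one-item list describing the file that was "read".
        return [f"loaded {file}"]


def load_with(reader, path):
    # Assumed call pattern: the loader calls load_data on whatever
    # object was stored in the file_extractor mapping.
    return reader.load_data(path)


# Instance: `self` is bound automatically, so `path` lands in `file`.
ok = load_with(StandInReader(), "manual.md")

# Class: nothing is bound, so `path` fills `self` and `file` is missing.
try:
    load_with(StandInReader, "manual.md")
    error = None
except TypeError as exc:
    error = str(exc)

print(ok)
print(error)
```

With the class in the mapping, the path is consumed as `self`, which is why the message complains about a missing `file` argument rather than about the mapping itself.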

jjoey

@Logan M that's it! thank you. i was looking at default_file_reader_cls in core/readers/file/base.py and got confused because it just uses the classnames there

Logan M

yea a tad confusing 😅 I think it goes and initializes the defaults at some point
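
One way to guard against this mix-up in your own code is to normalize the mapping before handing it to the loader. This is only a sketch: instantiate_extractors and StandInReader are hypothetical names, not part of llama-index, and it assumes every reader class can be constructed with no arguments.

```python
import inspect


class StandInReader:
    """Hypothetical stand-in for a reader class."""

    def load_data(self, file):
        return [f"loaded {file}"]


def instantiate_extractors(file_extractor):
    """Return a copy of the mapping where bare classes become instances."""
    return {
        # Classes are called with no arguments (an assumption);
        # objects that are already instances pass through unchanged.
        ext: reader() if inspect.isclass(reader) else reader
        for ext, reader in file_extractor.items()
    }


# A mapping that mixes a bare class with a ready-made instance.
mixed = {".md": StandInReader, ".txt": StandInReader()}
fixed = instantiate_extractors(mixed)
```

Readers that need constructor arguments would still have to be instantiated by hand, so this only helps with the no-argument case.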
