I'm getting some weird behaviour from SimpleDirectoryReader() with LlamaParse and I'm wondering if it's intentional. When I load just one file, I end up with multiple Document objects.
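from llama_index.core import SimpleDirectoryReader
from llama_parse import LlamaParse

# Parse PDFs with LlamaParse and return the result as markdown
parser = LlamaParse(
    result_type="markdown",
    verbose=True,
)
# Route every .pdf file through the LlamaParse parser
file_extractor = {".pdf": parser}
document = SimpleDirectoryReader(
    input_files=[pdf_path],  # pdf_path is ONE file path, e.g. './easy_data/example_file.pdf'
    file_extractor=file_extractor,
    filename_as_id=True,
).load_data(show_progress=True)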
However, when I run len(document) I get a number greater than 1, which doesn't make sense to me. Any ideas what's going on?
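In case it helps narrow this down, here is a minimal sketch of how the returned objects could be counted per source file. It assumes each object exposes a `metadata` dict with a `file_name` key, which I believe SimpleDirectoryReader sets by default but have not confirmed for the LlamaParse path. The helper `count_docs_per_file` and the stand-in objects are hypothetical, so the snippet runs without llama_index or an API key.

```python
from collections import Counter
from types import SimpleNamespace


def count_docs_per_file(documents):
    """Count how many document objects came from each source file."""
    # Assumption: every object has a `metadata` dict, and "file_name" is the
    # key the reader populates. Objects without it are grouped as "<unknown>".
    return Counter(doc.metadata.get("file_name", "<unknown>") for doc in documents)


# Hypothetical stand-ins for the objects returned by load_data();
# in practice, pass the real list instead.
fake_documents = [
    SimpleNamespace(metadata={"file_name": "example_file.pdf"}, text="first part"),
    SimpleNamespace(metadata={"file_name": "example_file.pdf"}, text="second part"),
]

counts = count_docs_per_file(fake_documents)
print(counts)
```

If the real list shows a single file name with a count above 1, the one PDF is being split into several objects rather than extra files being picked up.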