When I load a single Markdown document (originally an annual report in PDF format) with the following call, it gets split into ~100 documents:

    documents = SimpleDirectoryReader(directory).load_data()

Any idea why this is happening? Some of the resulting documents are only two words long; others are around 100 words.
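One possible explanation, stated as an assumption rather than a confirmed diagnosis: if the reader used for .md files starts a new document at every Markdown heading, then a report with many headings and subheadings would come back as many small documents of uneven length. The sketch below is not LlamaIndex code. `split_on_headings` and `sample` are hypothetical names invented purely to illustrate how heading-based splitting can produce both two-word and much longer pieces.

```python
import re


def split_on_headings(markdown_text):
    """Split Markdown into sections, starting a new section at each heading line.

    Hypothetical helper for illustration only; it does not reproduce any
    library's actual implementation.
    """
    sections = []
    current = []
    for line in markdown_text.splitlines():
        # Treat a line starting with 1-6 '#' characters plus whitespace as a heading.
        if re.match(r"#{1,6}\s", line) and current:
            sections.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        sections.append("\n".join(current).strip())
    # Drop sections that are empty after stripping whitespace.
    return [s for s in sections if s]


# Made-up sample text standing in for a converted annual report.
sample = """# Annual Report
## Highlights
Revenue grew and costs fell during the year.
## Outlook
### Risks
Currency exposure remains the main risk.
"""

sections = split_on_headings(sample)
word_counts = [len(s.split()) for s in sections]

for count, section in zip(word_counts, sections):
    print(count, repr(section))
```

A heading followed directly by another heading becomes a section containing nothing but the heading text, which would match the two-word documents described above. Printing the word count and the first line of each loaded document should show whether the pieces line up with the report's headings.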