The community members are discussing the best node parser to use for Markdown files with tables. They are comparing MarkdownElementNodeParser and MarkdownNodeParser. The key differences are:
MarkdownElementNodeParser uses an LLM (Large Language Model) to create a summary of the extracted content, such as tables, so that the summary can be indexed alongside the raw content. In contrast, MarkdownNodeParser only converts the Markdown content into nodes and makes no LLM calls.
The community members also discuss how MarkdownElementNodeParser generates two types of nodes: Text Nodes and Index Nodes, the latter corresponding to embedded objects (e.g., tables). They are curious about how this parser handles tables in the Markdown document, and whether it creates a node containing the "text" of the table together with an LLM-generated summary of what the table is about.
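To make the two node types concrete, here is a minimal, standard-library-only sketch of the idea of separating a Markdown document into plain-text nodes and table nodes. It is an illustration under stated assumptions, not LlamaIndex's implementation: `Node` and `split_markdown` are hypothetical names, only pipe-style tables are detected, and the LLM summarization step discussed in the thread is left out entirely.

```python
import re
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for a parsed node (not a LlamaIndex class)."""
    kind: str  # "text" or "table"
    text: str


# Matches a pipe-table separator row such as "| --- | :---: |".
_SEPARATOR = re.compile(r"^\s*\|?\s*:?-+:?\s*(\|\s*:?-+:?\s*)*\|?\s*$")


def split_markdown(md: str) -> list[Node]:
    """Split Markdown into text nodes and table nodes (simplified sketch)."""
    lines = md.splitlines()
    nodes: list[Node] = []
    buffer: list[str] = []

    def flush() -> None:
        # Emit the accumulated non-table lines as a single text node.
        text = "\n".join(buffer).strip()
        if text:
            nodes.append(Node("text", text))
        buffer.clear()

    i = 0
    while i < len(lines):
        # A table starts with a header row followed by a separator row.
        starts_table = (
            "|" in lines[i]
            and i + 1 < len(lines)
            and "|" in lines[i + 1]
            and _SEPARATOR.match(lines[i + 1]) is not None
        )
        if starts_table:
            flush()
            j = i + 2
            # Consume the remaining rows of the table.
            while j < len(lines) and "|" in lines[j]:
                j += 1
            nodes.append(Node("table", "\n".join(lines[i:j])))
            i = j
        else:
            buffer.append(lines[i])
            i += 1
    flush()
    return nodes
```

In a pipeline of the kind the thread describes, each table node would additionally be passed to an LLM to obtain a summary for indexing, which is where the extra cost and the potential retrieval benefit come from.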
There is no explicitly marked answer in the provided information.
What would be the best node parser to use when I have Markdown files with tables inside: MarkdownElementNodeParser or MarkdownNodeParser? Can anyone please explain the difference in using these two parsers?
Specifically, how do they differ in terms of indexing strategies, and in terms of their influence on the performance of the retrieval part of the RAG pipeline?
I read that it generates two types of nodes: Text Nodes and Index Nodes corresponding to embedded objects (e.g. tables). How does it handle tables in the Markdown document? Does it create a node with the "text" of the table and a summary of what it's about using the LLM?