How can I build a `Document` from a list of `TextNode`?

I am working with LlamaParse output, which I break down into a list of nodes using `MarkdownNodeParser`, utilising `node.metadata['Header_1']` as a way of filtering those nodes by the md headers from my document and doing text amendment.

Now that I have updated `llama-index-core`, the `node.metadata` dictionary is missing the `Header_1` key. What I do now is manually add the headers back, but I'm stuck with a list of updated `TextNode` objects, not knowing how to convert them into a `Document`.
    from llama_index.core.schema import TextNode

    # Build a node and set the header metadata by hand
    node1 = TextNode(text="<text_chunk>", id_="<node_id>")
    node1.metadata['Header_1'] = 'ADD_HEADER'
`header_path`: you can just try accessing this instead, I suppose. If there is no header in a section then it is just `/`, else it is an actual path, for example `/1. Introduction/1.1 Subsection`.

Given a list of `TextNode`, is there a way to combine them into a `Document` object? Before updating the `llama-index-core` module, this wasn't an issue.

Given a `Document`
object, you can do:

    from llama_index.core.node_parser import MarkdownNodeParser

    parser = MarkdownNodeParser()
    # `document` is your existing Document instance
    nodes = parser.get_nodes_from_documents([document])

`nodes` will be a list of `TextNode`. Is there a way to combine them back into a `Document` object? Like an inverse transform operation.

    from llama_index.core import Document

    text = ''
    metadata = {}
    for node in nodes:
        text = text + node.text
        # Document.metadata is a single dict, so merge each node's metadata
        # into it (later nodes overwrite earlier values for the same key)
        metadata.update(node.metadata)

    # now form the document object using the text and metadata
    doc = Document(text=text, metadata=metadata)
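One thing to watch when collapsing nodes into a single document is that per-node metadata (such as header values) usually differs between nodes, so a plain merge silently keeps whichever node came last. Below is a minimal sketch of an alternative that keeps only the metadata every node agrees on. The helper name `merge_nodes` and the `"\n\n"` separator are assumptions for illustration, not part of llama-index, and `SimpleNamespace` objects stand in for real `TextNode` instances so the snippet runs without the library.

```python
from types import SimpleNamespace


def merge_nodes(nodes, separator="\n\n"):
    """Join node texts and keep only the metadata shared by every node.

    Hypothetical helper: works on any objects exposing `.text` and
    `.metadata`, which is how TextNode is used in the thread above.
    """
    text = separator.join(node.text for node in nodes)
    shared = dict(nodes[0].metadata) if nodes else {}
    for node in nodes[1:]:
        # Drop keys that are missing or hold a different value in this node
        shared = {
            key: value
            for key, value in shared.items()
            if key in node.metadata and node.metadata[key] == value
        }
    return text, shared


# Stand-ins for TextNode objects (assumed shape: .text and .metadata)
example_nodes = [
    SimpleNamespace(text="# Intro", metadata={"file_name": "a.md", "header_path": "/"}),
    SimpleNamespace(text="Body text", metadata={"file_name": "a.md", "header_path": "/Intro"}),
]
merged_text, merged_metadata = merge_nodes(example_nodes)
```

With real nodes, the two return values could then be passed on as `Document(text=merged_text, metadata=merged_metadata)`. Whether dropping the conflicting keys or keeping the last value is preferable depends on how the document is used afterwards.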
Metadata for a `Document` can be added in similar ways as for a `TextNode`.
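Rather than typing the header values back in by hand, the `header_path` string mentioned earlier in the thread could be split into per-level keys. The following is only a sketch based on the path format described above (`/` for no header, otherwise `/1. Introduction/1.1 Subsection`). The function name `headers_from_path` is hypothetical, and the `Header_2`, `Header_3`, ... key names are an assumption extrapolated from `Header_1`.

```python
def headers_from_path(header_path, separator="/"):
    """Split a header path into {"Header_1": ..., "Header_2": ...}.

    Hypothetical helper: assumes the separator never appears inside a
    heading itself, which will not hold for titles such as "Input/Output".
    """
    # Empty segments come from the leading (or a trailing) separator
    parts = [part for part in header_path.split(separator) if part]
    return {f"Header_{level}": title for level, title in enumerate(parts, start=1)}


no_headers = headers_from_path("/")
nested = headers_from_path("/1. Introduction/1.1 Subsection")

# Possible use on a parsed node (not run here, needs llama-index):
# node.metadata.update(headers_from_path(node.metadata.get("header_path", "/")))
```

It is worth checking on a few real nodes whether `header_path` includes the node's own heading or only its parent headings, since the thread does not say and the resulting `Header_n` values depend on it.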