how to get all the nodes from a document in index?

@kapa.ai how to get all the nodes from a document in index?



kapa.ai

----------------------
To get all the nodes from a document in LlamaIndex, you can use the NodeParser class. Here's an example using the SimpleNodeParser:

from llama_index.node_parser import SimpleNodeParser

parser = SimpleNodeParser()
nodes = parser.get_nodes_from_documents(documents)

In this example, documents is a list of Document objects. The get_nodes_from_documents method will parse these documents into Node objects, which represent "chunks" of the source documents.

Source: LlamaIndex Usage Pattern - Parse the Documents into Nodes (https://gpt-index.readthedocs.io/en/latest/guides/primer/usage_pattern.html#2-parse-the-documents-into-nodes)
----------------------

Logan M

@4c_cccc try index.ref_doc_info

This should return a mapping of every ref_doc_id (i.e., each ingested document) to the IDs of the nodes it was parsed into

index.docstore.docs will also return every node object, keyed by node ID
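
A rough sketch of how the two could be combined to pull the node objects for a single document. The helper name nodes_for_document is made up for illustration, and it assumes each entry in index.ref_doc_info exposes its node IDs as a node_ids attribute (this may be named differently in older releases, so check your version):

```python
from typing import Any, List


def nodes_for_document(index: Any, ref_doc_id: str) -> List[Any]:
    """Return the node objects parsed from one ingested document.

    Hypothetical helper: assumes index.ref_doc_info maps ref_doc_id to an
    object with a node_ids list, and index.docstore.docs maps node ID to node.
    """
    info = index.ref_doc_info.get(ref_doc_id)
    if info is None:
        # Unknown document ID: nothing was ingested under it.
        return []
    docs = index.docstore.docs
    # Keep the order recorded for the document; skip IDs missing from the docstore.
    return [docs[node_id] for node_id in info.node_ids if node_id in docs]
```

Usage would look like nodes = nodes_for_document(index, some_ref_doc_id), where some_ref_doc_id is one of the keys of index.ref_doc_info.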
