How to convert TextNode to IndexNode?

@kapa.ai How to convert TextNode to IndexNode?
Please stand by @ispan3tz as I search the knowledge base. I normally take about 20 seconds or so. :wave:
----------------------
To convert a TextNode to an IndexNode in LlamaIndex, you can use the from_text_node class method provided by the IndexNode class. This method takes a TextNode and an index_id as arguments and returns an IndexNode.

Here is an example of how you can use it:

Python
from llama_index.core.schema import TextNode, IndexNode

# Create a TextNode
text_node = TextNode(text="<text_chunk>", id_="<node_id>")

# Convert TextNode to IndexNode
index_node = IndexNode.from_text_node(node=text_node, index_id="<index_id>")


In this example, replace <text_chunk> with the text content you want to store in the node, <node_id> with the unique identifier for the node, and <index_id> with the unique identifier for the index.

For more details, you can refer to the LlamaIndex API Reference.
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
@kapa.ai what if I have multiple nodes?
Let me check the knowledge base for you. One moment! :books:
----------------------
If you have multiple TextNode instances that you want to convert to IndexNode instances, you can simply use a loop to process each TextNode. Here's an example:

Python
from llama_index.core.schema import TextNode, IndexNode

# Assume text_nodes is a list of TextNode instances
text_nodes = [TextNode(text=f"<text_chunk_{i}>", id_=f"<node_id_{i}>") for i in range(10)]

# Convert each TextNode to an IndexNode
index_nodes = [IndexNode.from_text_node(node=text_node, index_id=f"<index_id_{i}>") for i, text_node in enumerate(text_nodes)]


In this example, replace <text_chunk_{i}>, <node_id_{i}>, and <index_id_{i}> with the text content, node identifier, and index identifier for each node, respectively.
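Note that the loop above gives every converted node its own index_id. Whether you want distinct ids or a single shared id depends on what the references should resolve to at query time. A plain-Python sketch of the two layouts (dicts stand in for nodes; all names are illustrative placeholders, not LlamaIndex APIs):

```python
# Stand-in sketch in plain Python (no llama-index needed): two common
# ways to assign index ids when converting many nodes.
nodes = [{"id": f"node-{i}", "text": f"chunk {i}"} for i in range(3)]

# (a) each converted node gets its own index_id, as in the loop above
distinct = [dict(n, index_id=f"index-{i}") for i, n in enumerate(nodes)]

# (b) every converted node references one shared parent id
shared = [dict(n, index_id="parent-0") for n in nodes]
```

Layout (b) is what the small-to-big pattern later in this thread uses: many small chunks all pointing at one parent.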

For more details, you can refer to the LlamaIndex API Reference.
----------------------
@kapa.ai how to create the VectorStoreIndex with these IndexNodes?
Let me check the knowledge base for you. One moment! :books:
----------------------
To create a VectorStoreIndex with IndexNode instances, you can pass the list of IndexNode instances to the VectorStoreIndex constructor. Here's an example:

Python
from llama_index.core import VectorStoreIndex

# Assume index_nodes is a list of IndexNode instances
index = VectorStoreIndex(index_nodes)


In this example, replace index_nodes with your list of IndexNode instances.

Please note that in some cases you might also need to provide a StorageContext to the VectorStoreIndex constructor, especially if you're using a custom vector store. The StorageContext can be created by passing the custom vector store to StorageContext.from_defaults.

Here's an example:

Python
from llama_index.core import VectorStoreIndex, StorageContext
from llama_index.vector_stores.your_store import YourCustomVectorStore  # placeholder; the actual module path depends on the integration

# Assume index_nodes is a list of IndexNode instances
# Assume your_vector_store is an instance of YourCustomVectorStore
storage_context = StorageContext.from_defaults(vector_store=your_vector_store)
index = VectorStoreIndex(index_nodes, storage_context=storage_context)


In this example, replace YourCustomVectorStore with the class of your custom vector store, and your_vector_store with an instance of your custom vector store.

For more details, you can refer to the LlamaIndex API Reference.
----------------------
@kapa.ai but how can we retrieve smaller chunks and follow references to bigger chunks?
Let me check the knowledge base for you. One moment! :books:
----------------------
To retrieve smaller chunks and follow references to bigger chunks, you can use the SentenceSplitter class to split your base nodes into smaller chunks, then create IndexNode instances from those chunks. Each IndexNode keeps its own node_id but stores the base node's node_id as its index_id, effectively creating a reference back to the bigger chunk.

Here's an example:

Python
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.schema import IndexNode

# Define the sizes for your sub-chunks
sub_chunk_sizes = [128, 256, 512]

# Create SentenceSplitter instances for each sub-chunk size
sub_node_parsers = [SentenceSplitter(chunk_size=c, chunk_overlap=20) for c in sub_chunk_sizes]

# Assume base_nodes is a list of your base nodes
all_nodes = []
for base_node in base_nodes:
    for n in sub_node_parsers:
        # Split the base node into sub-nodes
        sub_nodes = n.get_nodes_from_documents([base_node])
        
        # Create IndexNode instances from the sub-nodes
        sub_inodes = [IndexNode.from_text_node(sn, base_node.node_id) for sn in sub_nodes]
        
        # Add the IndexNode instances to your list
        all_nodes.extend(sub_inodes)
        
    # Also add the original base node to your list
    original_node = IndexNode.from_text_node(base_node, base_node.node_id)
    all_nodes.append(original_node)


In this example, replace base_nodes with your list of base nodes.

During query time, you can use a RecursiveRetriever to retrieve the smaller chunks and follow their references to the bigger chunks: it first fetches the matching small chunks, then resolves each index_id to the corresponding bigger chunk, providing more context for synthesis.
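The reference-following step that RecursiveRetriever performs can be sketched in plain Python (dicts stand in for nodes; all names here are illustrative, not LlamaIndex APIs): retrieval first hits the small chunks, then their index_id values are resolved to the parent chunks.

```python
# Stand-in sketch (no llama-index needed) of following index_id
# references from retrieved small chunks back to their parent chunk.
base_nodes = {"base-1": "The full, larger section text ..."}
sub_nodes = [
    {"id": "sub-1", "text": "small chunk one", "index_id": "base-1"},
    {"id": "sub-2", "text": "small chunk two", "index_id": "base-1"},
]

def follow_references(retrieved_ids):
    """Map retrieved small-chunk ids to their (deduplicated) parent texts."""
    parents = []
    for rid in retrieved_ids:
        sub = next(s for s in sub_nodes if s["id"] == rid)
        text = base_nodes[sub["index_id"]]
        if text not in parents:
            parents.append(text)
    return parents
```

In LlamaIndex itself, this resolution is what RecursiveRetriever does when given a node_dict mapping ids to nodes and a retriever over the small chunks.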
For more details, you can refer to the LlamaIndex API Reference and the LlamaIndex Examples.
----------------------