
Updated 2 months ago

I want to create a new index from existing nodes that are stored on disk which are not in

I want to create a new index from existing nodes that are stored on disk and not currently in memory. How do I load all of the nodes so that I can build a new index from them?
9 comments
Please stand by @Brainstencil as I search the knowledge base. We are experiencing especially high traffic right now and might take longer than usual. :wave:
----------------------
Hmm, I don't know enough to give you a confident answer yet. However, you can refer to the LlamaIndex documentation for more information on how to work with nodes and indices: https://gpt-index.readthedocs.io/en/latest
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
@kapa.ai how do I load a docstore from disk?
Let me check the knowledge base for you. One moment! :books:
----------------------
To load a docstore from disk, you need to recreate the storage context using the same configuration, specifically by providing the persist_dir parameter. Here's an example using SimpleDocumentStore:

from llama_index.storage.storage_context import StorageContext
from llama_index.storage.docstore import SimpleDocumentStore

# Rebuild the storage context around the docstore persisted in <persist_dir>.
storage_context = StorageContext.from_defaults(
    docstore=SimpleDocumentStore.from_persist_dir(persist_dir="<persist_dir>"),
)


Replace <persist_dir> with the directory where your docstore is saved on disk. This will load the docstore into the storage context.

For more information, refer to the Persisting & Loading Data (https://gpt-index.readthedocs.io/en/latest/how_to/storage/save_load.html) section in the LlamaIndex documentation.
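Before loading, it can help to confirm that the persisted docstore file is actually present in the directory. The following is a minimal sketch using only the standard library. It assumes the docstore was saved under the default file name docstore.json; the helper name find_docstore_file is hypothetical and not part of LlamaIndex.

```python
from pathlib import Path

# Assumption: the docstore was persisted under its default file name.
DOCSTORE_FILENAME = "docstore.json"


def find_docstore_file(persist_dir: str) -> Path:
    """Return the path of the persisted docstore, or raise if it is missing."""
    path = Path(persist_dir) / DOCSTORE_FILENAME
    if not path.is_file():
        raise FileNotFoundError(f"No docstore found at {path}")
    return path
```

If the check passes, the same directory can be handed to SimpleDocumentStore.from_persist_dir as shown above. If a custom file name was used when persisting, adjust DOCSTORE_FILENAME accordingly.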
----------------------
@kapa.ai how do I import the nodes from the docstore as well?
Let me check the knowledge base for you. One moment! :books:
----------------------
To import nodes from the docstore, you can use the get_nodes method provided by the BaseDocumentStore class. Here's an example using the storage_context we created earlier:

# Collect every ID in the docstore as a list (get_nodes expects a list of IDs).
node_ids = list(storage_context.docstore.docs.keys())
nodes = storage_context.docstore.get_nodes(node_ids)


This will retrieve all the nodes from the docstore. You can then use these nodes to create a new index.

For more information, refer to the Document Store (https://gpt-index.readthedocs.io/en/latest/reference/storage/docstore.html) section in the LlamaIndex documentation.
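To tie the steps in this thread back to the original question, here is a hedged sketch of a small helper that pulls every node out of a docstore. The helper name load_all_nodes is hypothetical. It assumes only what the answer above relies on: a docs mapping keyed by ID and a get_nodes method that accepts a list of IDs. The index-building lines are left as comments because the index class name (VectorStoreIndex here) depends on the installed LlamaIndex version.

```python
from typing import Any, List


def load_all_nodes(docstore: Any) -> List[Any]:
    """Return every node held by a docstore-like object.

    Assumes `docstore.docs` maps IDs to nodes and that
    `docstore.get_nodes(ids)` accepts a list of IDs.
    """
    node_ids = list(docstore.docs.keys())  # materialize the dict view
    return docstore.get_nodes(node_ids)


# Hypothetical usage with LlamaIndex (not executed here):
# nodes = load_all_nodes(storage_context.docstore)
# index = VectorStoreIndex(nodes, storage_context=storage_context)
```

Passing the existing storage_context when building the index keeps the new index pointed at the docstore that was just loaded from disk.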
----------------------
I think the correct answer is:
storage_context.docstore.add_documents(nodes)