
Loading from storage

At a glance

The post is about an AttributeError related to the 'index_structs' attribute of a 'dict' object. Community members discuss the issue, with one suggesting providing the full stack trace to better understand the problem. Another community member shares a working example using the GPTListIndex and SimpleDirectoryReader classes from the llama_index library.

The discussion then focuses on reusing nodes across multiple index structures and persisting multiple indices to the same storage. A community member notes that the documentation covers reusing nodes, but does not explain how to load from storage. Another community member suggests persisting each index separately, unless the indexes are very large. The maintainer of the llama_index library then provides guidance on reusing nodes and persisting multiple indices to the same storage, including a link to a relevant notebook example.

Plain Text
AttributeError: 'dict' object has no attribute 'index_structs'
8 comments
What's the full stack trace? Easier to see what's going on with that
Actually I'll also make a quick example that works, maybe that will help to align with what's going on
This flow seems to work fine for me (v0.6.1)

Plain Text
>>> from llama_index import GPTListIndex, SimpleDirectoryReader
>>> documents = SimpleDirectoryReader("./data").load_data()
>>> index = GPTListIndex.from_documents(documents)
>>> index.storage_context.persist(persist_dir="./my_index")
>>> import os
>>> os.listdir("./my_index")
['docstore.json', 'index_store.json', 'vector_store.json']
>>> from llama_index import StorageContext, load_index_from_storage
>>> storage_context = StorageContext.from_defaults(persist_dir="./my_index")
>>> new_index = load_index_from_storage(storage_context)
>>>
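As a quick sanity check on that flow, the reloaded index can be queried directly. This is only a sketch assuming the same v0.6.x API as above; the question string is a placeholder.

Plain Text
>>> query_engine = new_index.as_query_engine()
>>> response = query_engine.query("What is this document about?")
>>> print(response)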
I would like to reuse the nodes across multiple indexes
Plain Text
index1 = GPTVectorStoreIndex(nodes, service_context=service_context)
index2 = GPTListIndex(nodes, service_context=service_context)
How do I get the nodes from the storage_context?
Yes, but it does not explain how to load from storage. I have a first step where I load, parse, and store, finishing with
Plain Text
    storage_context_node.persist(persist_dir="./storage/" + directory)
Then I want to reuse this stored info to query from a Slack bot, a web bot, or a CLI bot, so I need to load from storage and recreate the nodes in order to build the index for each bot.
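One possible shape for that per-bot loading step is sketched below. It assumes the persisted docstore exposes the stored nodes through its docs mapping; the names directory and service_context come from the snippets in this thread, and everything else is illustrative rather than a confirmed recipe.

Plain Text
from llama_index import StorageContext, GPTVectorStoreIndex, GPTListIndex

# reload what the ingestion step persisted
storage_context = StorageContext.from_defaults(persist_dir="./storage/" + directory)

# pull the parsed nodes back out of the docstore (assumes a .docs mapping of id -> node)
nodes = list(storage_context.docstore.docs.values())

# rebuild whichever index each bot needs from the recovered nodes
index1 = GPTVectorStoreIndex(nodes, service_context=service_context)
index2 = GPTListIndex(nodes, service_context=service_context)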
Hmm, tried a few things. Thought I had it but then the index_struct was empty πŸ™ƒ

This probably needs some better UX. @jerryjliu0 Is it possible to call persist but then use that across different index types? Or something similar? Doesn't seem so straightforward at the moment πŸ€” Seems like you always need the original nodes or documents in the examples.

@jerome I would just persist each index separately for now. Shouldn't be a huge deal unless your indexes are many GBs
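For the persist-each-index-separately workaround, a minimal sketch (with hypothetical directory names) could look like this:

Plain Text
# persist each index to its own directory
index1.storage_context.persist(persist_dir="./storage/vector")
index2.storage_context.persist(persist_dir="./storage/list")

# later, load each one back independently
from llama_index import StorageContext, load_index_from_storage
index1 = load_index_from_storage(StorageContext.from_defaults(persist_dir="./storage/vector"))
index2 = load_index_from_storage(StorageContext.from_defaults(persist_dir="./storage/list"))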
@jerome you can reuse nodes from index structures, and also persist different indexes to the same storage. @Logan M I agree that we could make this more clear
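A hedged sketch of that multiple-indexes-in-one-storage pattern, assuming the v0.6.x load_indices_from_storage helper (the notebook link itself is not preserved in this thread):

Plain Text
from llama_index import StorageContext, load_indices_from_storage
from llama_index import GPTVectorStoreIndex, GPTListIndex

# build two indexes over the same nodes, sharing one storage context
storage_context = StorageContext.from_defaults()
index1 = GPTVectorStoreIndex(nodes, storage_context=storage_context, service_context=service_context)
index2 = GPTListIndex(nodes, storage_context=storage_context, service_context=service_context)
storage_context.persist(persist_dir="./storage")

# later: load every index persisted to that directory
storage_context = StorageContext.from_defaults(persist_dir="./storage")
indices = load_indices_from_storage(storage_context)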
