I thiiiink you can load the index from storage (as you are doing), call insert_nodes(), and then call persist again to write to disk
index.storage_context.persist(persist_dir=...)
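A minimal sketch of that flow, assuming a llama_index version where StorageContext and load_index_from_storage are top-level imports; the persist_dir path, the SimpleNodeParser step, and the example document text are placeholders:

from llama_index import Document, StorageContext, load_index_from_storage
from llama_index.node_parser import SimpleNodeParser

# Load the previously persisted index from disk
storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)

# Parse a new document into nodes and insert them into the loaded index
parser = SimpleNodeParser.from_defaults()
new_nodes = parser.get_nodes_from_documents([Document(text="A new travel excursion ...")])
index.insert_nodes(new_nodes)

# Persist again so the updated index is written back to disk
index.storage_context.persist(persist_dir="./storage")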
but i thought the storage context contained the index information as well no?
the new storage context is very confusing to be honest 😦
hmm, ngl, it confuses me a bit too lol
what's the exact issue? I'm a little lost with what the problem is lol
so i have a list of documents, each is a travel excursion. i want to load them into a GPTDocumentSummaryIndex and then use the summaries to retrieve relevant excursions based on the travel location. right now i load the first document
then when i load the second it doesn't load the summary as well
i see all of the documents in the doc store but when i do the retrieval it only ever returns the first one or "None"
So step one, you built the index using all your travel documents, right?
not all just 1 of the docs
then i iterate through them and add them one by one to the index
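A rough sketch of that setup: build the summary index from the first excursion, then insert the rest one at a time. The exact import path for GPTDocumentSummaryIndex and the Document constructor vary across llama_index versions, and the excursion texts and query below are made-up placeholders:

from llama_index import Document, GPTDocumentSummaryIndex

excursions = [
    Document(text="Snorkeling tour in Cozumel ..."),
    Document(text="Walking food tour in Lisbon ..."),
]

# Build the index with the first document only
index = GPTDocumentSummaryIndex.from_documents([excursions[0]])

# Insert the remaining (and any future) excursions one by one
for doc in excursions[1:]:
    index.insert(doc)

# Retrieve excursions relevant to a travel location
retriever = index.as_retriever()
results = retriever.retrieve("excursions near Lisbon")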
Why not build with them all at once, just curious?
because i need to be able to add more over time
as new ones get created
I see I see, that makes sense
I think this might be a symptom of how the summary index works, I need to read the source code a bit more lol
ok so it should work fine with the others but it doesn't. My understanding is that you can define multiple indexes and then save them all together using the storage context
You can, as long as you specify an ID for each index
I still need to try doing that lol
ok i do that though with no problem
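A sketch of that multi-index pattern, assuming the set_index_id / load_index_from_storage(index_id=...) API; the index IDs, persist directory, and documents are placeholders:

from llama_index import (
    Document,
    GPTDocumentSummaryIndex,
    GPTVectorStoreIndex,
    StorageContext,
    load_index_from_storage,
)

docs = [Document(text="Snorkeling tour in Cozumel ...")]
storage_context = StorageContext.from_defaults()

# Build two indexes that share one storage context, each with its own ID
summary_index = GPTDocumentSummaryIndex.from_documents(docs, storage_context=storage_context)
summary_index.set_index_id("excursion_summaries")

vector_index = GPTVectorStoreIndex.from_documents(docs, storage_context=storage_context)
vector_index.set_index_id("excursion_vectors")

# Persist both together
storage_context.persist(persist_dir="./storage")

# Later, load a specific index back by its ID
storage_context = StorageContext.from_defaults(persist_dir="./storage")
summary_index = load_index_from_storage(storage_context, index_id="excursion_summaries")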
i tried switching to vector index and now i'm getting a whole different error, could it be because i'm naming my data chunks?

for i, node in enumerate(nodes, start=1):
    node.doc_id = f"{uploaded_filename}chunk{i}of{len(nodes)}"

File "/Users/sheresaidon/virtualenvs/bright-black-ai-chat-template/lib/python3.10/site-packages/llama_index/indices/vector_store/retrievers.py", line 90, in _retrieve
node_ids = [
File "/Users/sheresaidon/virtualenvs/bright-black-ai-chat-template/lib/python3.10/site-packages/llama_index/indices/vector_store/retrievers.py", line 91, in <listcomp>
self._index.index_struct.nodes_dict[idx] for idx in query_result.ids
KeyError: '3ddd2b47-9366-4c78-ba28-36f84f770ec4'
Uhhh that could be why πŸ€” I'm not totally sure tbh
If the nodes were named before constructing the index, I would think that's fine
I really need to play around with these kinds of situations more
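A hedged sketch of doing the renaming before the index is built, so the IDs the vector store hands back match index_struct.nodes_dict. Whether the attribute is doc_id or node_id depends on the llama_index version, and the filename and document text are placeholders:

from llama_index import Document, GPTVectorStoreIndex
from llama_index.node_parser import SimpleNodeParser

uploaded_filename = "cozumel_snorkeling.txt"  # placeholder
docs = [Document(text="Snorkeling tour in Cozumel ...")]

parser = SimpleNodeParser.from_defaults()
nodes = parser.get_nodes_from_documents(docs)

# Rename the nodes first, then build the index from the renamed nodes
for i, node in enumerate(nodes, start=1):
    node.doc_id = f"{uploaded_filename}chunk{i}of{len(nodes)}"

index = GPTVectorStoreIndex(nodes)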
i figured it out
man that wasn't easy lol
so first load the storage context, load the index with the storage context, add the nodes to the index with insert(node)
u never have to add it to the storage context
which is super weird and confusing
how does the storage context update lol
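Roughly, a sketch of the flow that ended up working; the persist_dir path and the example document are placeholders, and insert() is expected to update the underlying stores on its own:

from llama_index import Document, StorageContext, load_index_from_storage

# Load the storage context, then load the index from it
storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)

# insert() on the index is enough; no manual storage context updates needed
index.insert(Document(text="Sunset kayak tour in Split ..."))

# Persist so the update is written back to disk
index.storage_context.persist(persist_dir="./storage")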
That's fun πŸ™ƒ

Calling insert in the index updates the docstore, index struct, and vector store (if applicable) in unison, maybe that's the key πŸ‘€
that would be magic lol
but i don't see how me inserting into an index would update a storage_context variable unless i somehow associated it, no?