Hi guys, I already raised this issue on GitHub earlier but had no luck there 🥲 so I ended up here 🙂
After upgrading llama_index from 0.4.40 to 0.5.2,
GPTSimpleVectorIndex errors out with a log saying it can't find the doc_id when index.update(document) is called twice.
Here's my code:
from llama_index import GPTSimpleVectorIndex, Document

index = GPTSimpleVectorIndex([])
document = Document(text="0", doc_id="example_doc_id")
index.insert(document)

document.text = "1"
index.update(document)  # first update goes through

document.text = "2"
index.update(document)  # second update raises the KeyError shown below
Also, when I call index.refresh([document]) twice, I get the same error.
The traceback looks like this:
File /opt/conda/lib/python3.10/site-packages/llama_index/indices/vector_store/base.py:209, in GPTVectorStoreIndex._delete(self, doc_id, delete_kwargs)
    207 def _delete(self, doc_id: str, delete_kwargs: Any) -> None:
    208     """Delete a document."""
--> 209     self._index_struct.delete(doc_id)
    210     self._vector_store.delete(doc_id)

File /opt/conda/lib/python3.10/site-packages/llama_index/data_structs/data_structs_v2.py:206, in IndexDict.delete(self, doc_id)
    204     raise ValueError("doc_id not found in doc_id_dict")
    205 for vector_id in self.doc_id_dict[doc_id]:
--> 206     del self.nodes_dict[vector_id]

KeyError: 'cd67fc18-7a95-4cdf-ad4a-1f5ef323d0fe'
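
My guess from the traceback is that deleting a document removes its vectors from nodes_dict but leaves the old vector ids sitting in doc_id_dict, so the next delete trips over an id that is already gone. I haven't confirmed this in the library source. The sketch below is only a toy model of that idea: ToyIndexDict, FixedToyIndexDict and second_update_fails are names I made up for illustration, and none of it is llama_index code.

```python
import uuid
from typing import Dict, List


class ToyIndexDict:
    """Hypothetical stand-in for an index struct; not llama_index code."""

    def __init__(self) -> None:
        self.nodes_dict: Dict[str, str] = {}         # vector_id -> text
        self.doc_id_dict: Dict[str, List[str]] = {}  # doc_id -> vector_ids

    def add(self, doc_id: str, text: str) -> str:
        vector_id = str(uuid.uuid4())
        self.nodes_dict[vector_id] = text
        self.doc_id_dict.setdefault(doc_id, []).append(vector_id)
        return vector_id

    def delete(self, doc_id: str) -> None:
        if doc_id not in self.doc_id_dict:
            raise ValueError("doc_id not found in doc_id_dict")
        for vector_id in self.doc_id_dict[doc_id]:
            del self.nodes_dict[vector_id]
        # Assumed bug: doc_id_dict[doc_id] is never cleared, so stale ids stay.

    def update(self, doc_id: str, text: str) -> None:
        # Modeled as delete followed by re-insert.
        self.delete(doc_id)
        self.add(doc_id, text)


class FixedToyIndexDict(ToyIndexDict):
    """Same toy, but delete also drops the doc_id entry."""

    def delete(self, doc_id: str) -> None:
        if doc_id not in self.doc_id_dict:
            raise ValueError("doc_id not found in doc_id_dict")
        # pop() removes the entry, so later deletes only see live vector ids.
        for vector_id in self.doc_id_dict.pop(doc_id):
            del self.nodes_dict[vector_id]


def second_update_fails(index: ToyIndexDict) -> bool:
    """Insert once, update twice, and report whether the second update raised KeyError."""
    index.add("example_doc_id", "0")
    index.update("example_doc_id", "1")
    try:
        index.update("example_doc_id", "2")
    except KeyError:
        return True
    return False
```

In this toy, second_update_fails(ToyIndexDict()) hits the KeyError on the second update, while the FixedToyIndexDict variant does not. If the real IndexDict behaves the same way, that would also explain why calling refresh twice fails, assuming refresh goes through the same delete path.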
I think some of the issues above are closely related to looking up doc_ids, just like mine!