
Updated 3 months ago

Update index

Hi guys, I already raised this issue on GitHub earlier but had no luck there 🥲 so I finally came here 🙂

After upgrading llama_index from version 0.4.40 to 0.5.2,
GPTSimpleVectorIndex raises an error saying it can't find the doc_id when index.update(document) is called twice.

Here's my code.

from llama_index import GPTSimpleVectorIndex, Document

index = GPTSimpleVectorIndex([])
document = Document(text="0", doc_id="example_doc_id")
index.insert(document)
document.text = "1"
index.update(document)
document.text = "2"
index.update(document)

Also, when I call index.refresh([document]) twice, I get the same error.

The error traceback is below.

File /opt/conda/lib/python3.10/site-packages/llama_index/indices/vector_store/base.py:209, in GPTVectorStoreIndex._delete(self, doc_id, delete_kwargs)
207 def _delete(self, doc_id: str, delete_kwargs: Any) -> None:
208 """Delete a document."""
--> 209 self._index_struct.delete(doc_id)
210 self._vector_store.delete(doc_id)

File /opt/conda/lib/python3.10/site-packages/llama_index/data_structs/data_structs_v2.py:206, in IndexDict.delete(self, doc_id)
204 raise ValueError("doc_id not found in doc_id_dict")
205 for vector_id in self.doc_id_dict[doc_id]:
--> 206 del self.nodes_dict[vector_id]

KeyError: 'cd67fc18-7a95-4cdf-ad4a-1f5ef323d0fe'

I think some of the issues posted above are also related to doc_id lookups, like mine!
2 comments
Hey! From the traceback, I think there is a bug in the delete function (which is used during update/refresh).

Another user pointed out yesterday that delete wasn't actually deleting.

I'll have a look at this at some point today 👍
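
For anyone following along, here is a rough toy model of what the traceback seems to point at. It is only a guess at the mechanism, not the actual llama_index source: ToyIndexDict, forget_doc_id and run are made-up names, and the only parts taken from the traceback are the two dicts (nodes_dict, doc_id_dict) and the delete loop. The assumption is that delete removes the nodes but leaves the stale vector ids in doc_id_dict, so the second update tries to delete a node that is already gone.

```python
# Toy stand-in for the index struct seen in the traceback.
# NOT the real llama_index implementation; all names here are hypothetical.
import uuid


class ToyIndexDict:
    def __init__(self):
        self.nodes_dict = {}   # vector_id -> text
        self.doc_id_dict = {}  # doc_id -> list of vector_ids

    def add(self, doc_id, text):
        vector_id = str(uuid.uuid4())
        self.nodes_dict[vector_id] = text
        self.doc_id_dict.setdefault(doc_id, []).append(vector_id)

    def delete(self, doc_id, forget_doc_id=False):
        if doc_id not in self.doc_id_dict:
            raise ValueError("doc_id not found in doc_id_dict")
        for vector_id in self.doc_id_dict[doc_id]:
            del self.nodes_dict[vector_id]
        if forget_doc_id:
            # Assumed fix: also drop the doc_id -> vector_ids mapping,
            # so stale vector ids are not revisited on the next delete.
            del self.doc_id_dict[doc_id]

    def update(self, doc_id, text, forget_doc_id=False):
        # Update modeled as delete followed by insert.
        self.delete(doc_id, forget_doc_id)
        self.add(doc_id, text)


def run(forget_doc_id):
    index = ToyIndexDict()
    index.add("example_doc_id", "0")
    index.update("example_doc_id", "1", forget_doc_id)
    index.update("example_doc_id", "2", forget_doc_id)
    return list(index.nodes_dict.values())
```

In this toy version, run(False) fails on the second update with a KeyError on a UUID, the same shape as the error in the original post, while run(True) finishes with only the latest text stored. If the real bug works the same way, the fix would be to clear the doc_id entry in the delete path.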
Thanks for your great support @Logan M! bb