diridiri
Joined September 25, 2024
I think the update method also still isn't working as expected: instead of deleting the original document, it just adds a new one.

Here's some simple test code to reproduce the index update error.

from llama_index import GPTSimpleVectorIndex, Document

# Build an index from a single document with a fixed doc_id
document1 = Document(text="11", doc_id="original_doc_id")
index = GPTSimpleVectorIndex.from_documents([document1])

print(index.docstore)

# Change the text and update; the old version should be replaced
document1.text = "asdf"
index.update(document1)
print("----------- after doc1 update ----------")
print(index.docstore)


This shows two documents in the index after updating the document, instead of one.
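
For comparison, the behavior one would expect from an update is a replace keyed on doc_id, so the store still holds a single entry afterwards. A minimal sketch with a plain dict standing in for the docstore (ToyDocStore and its methods are made up for illustration and are not llama_index API):

```python
class ToyDocStore:
    """Hypothetical stand-in for a docstore; not llama_index code."""

    def __init__(self):
        self.docs = {}  # doc_id -> text

    def insert(self, doc_id, text):
        self.docs[doc_id] = text

    def delete(self, doc_id):
        del self.docs[doc_id]

    def update(self, doc_id, text):
        # Expected semantics: drop the old version, then insert the new one.
        if doc_id in self.docs:
            self.delete(doc_id)
        self.insert(doc_id, text)


store = ToyDocStore()
store.insert("original_doc_id", "11")
store.update("original_doc_id", "asdf")
print(len(store.docs))  # 1
```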

Need your superpowers, Logan! 🥲
Hi guys, I already raised this issue on GitHub earlier but had no luck there 🥲 so I finally ended up here 🙂

After upgrading llama_index from 0.4.40 to 0.5.2, GPTSimpleVectorIndex errors out, with the log suggesting it can't find the doc_id, when index.update(document) is called twice.

Here's my code.

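from llama_index import GPTSimpleVectorIndex, Document

# Start from an empty index and insert one document
index = GPTSimpleVectorIndex([])
document = Document(text="0", doc_id="example_doc_id")
index.insert(document)

# First update
document.text = "1"
index.update(document)

# Second update on the same doc_id: this is where the error shows up
document.text = "2"
index.update(document)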

Also, when I call index.refresh([document]) twice, I get the same error.

The error output is below.

File /opt/conda/lib/python3.10/site-packages/llama_index/indices/vector_store/base.py:209, in GPTVectorStoreIndex._delete(self, doc_id, delete_kwargs)
    207 def _delete(self, doc_id: str, delete_kwargs: Any) -> None:
    208     """Delete a document."""
--> 209     self._index_struct.delete(doc_id)
    210     self._vector_store.delete(doc_id)

File /opt/conda/lib/python3.10/site-packages/llama_index/data_structs/data_structs_v2.py:206, in IndexDict.delete(self, doc_id)
    204     raise ValueError("doc_id not found in doc_id_dict")
    205 for vector_id in self.doc_id_dict[doc_id]:
--> 206     del self.nodes_dict[vector_id]

KeyError: 'cd67fc18-7a95-4cdf-ad4a-1f5ef323d0fe'
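
One possible way to end up with this KeyError, going only by the two frames above: if a delete removes entries from nodes_dict but leaves the old vector ids in doc_id_dict, the next delete for the same doc_id loops over ids that no longer exist. The toy class below is an assumption made for illustration (ToyIndexDict, its insert and update methods, and the stale-id behavior are hypothetical, not the library's actual code); only the delete loop mirrors the lines shown in the traceback.

```python
import uuid


class ToyIndexDict:
    """Hypothetical model of the structure in the traceback; not llama_index code."""

    def __init__(self):
        self.nodes_dict = {}   # vector_id -> node text
        self.doc_id_dict = {}  # doc_id -> list of vector_ids

    def insert(self, doc_id, text):
        vector_id = str(uuid.uuid4())
        self.nodes_dict[vector_id] = text
        # Assumption: ids are appended here and never cleared on delete.
        self.doc_id_dict.setdefault(doc_id, []).append(vector_id)

    def delete(self, doc_id):
        if doc_id not in self.doc_id_dict:
            raise ValueError("doc_id not found in doc_id_dict")
        for vector_id in self.doc_id_dict[doc_id]:
            del self.nodes_dict[vector_id]  # KeyError on a stale id

    def update(self, doc_id, text):
        # Update modeled as delete followed by insert.
        self.delete(doc_id)
        self.insert(doc_id, text)


index = ToyIndexDict()
index.insert("example_doc_id", "0")
index.update("example_doc_id", "1")  # first update succeeds
try:
    index.update("example_doc_id", "2")  # second update hits the stale id
    second_update_failed = False
except KeyError:
    second_update_failed = True
print(second_update_failed)
```

In this toy, removing doc_id from doc_id_dict at the end of delete makes the second update pass. Whether the real cause and fix look like that is something the maintainers would need to confirm.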

I think some of the issues posted above are closely related to doc_id lookup, like mine!