
nwdan
Joined September 25, 2024

Refresh

I am trying to insert a doc using refresh, i.e. if the doc has been inserted previously, don't insert it again. Here's my snippet of code, and it isn't working: every run with the same page and the same URL results in a fresh insertion.

index = VectorStoreIndex.from_documents([])
index.storage_context.persist(storage_location)

def add_page_to_index(page_text, page_url):
    document = Document(url=page_url, text=page_text)
    documents = [document]
    retval = index.refresh(documents)
    index.storage_context.persist(storage_location)
    return
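
For readers hitting the same symptom, here is a minimal sketch of why a refresh keyed on document ids can re-insert the same page every time. It is a toy model, not llama-index code: `ToyIndex`, its `refresh` method, and the example URL are all hypothetical, and the assumption is that refresh compares a document id plus a content hash against what was stored earlier.

```python
import hashlib
import uuid


class ToyIndex:
    """Toy stand-in for an index with refresh semantics (not llama-index code)."""

    def __init__(self):
        self.doc_hashes = {}  # doc_id -> content hash of the stored version

    def refresh(self, docs):
        """Return one bool per (doc_id, text) pair: True if inserted or updated."""
        changed = []
        for doc_id, text in docs:
            digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
            if self.doc_hashes.get(doc_id) == digest:
                changed.append(False)  # same id, same content: skip
            else:
                self.doc_hashes[doc_id] = digest  # new id or changed content
                changed.append(True)
        return changed


page_url = "https://example.com/page"  # hypothetical URL
page_text = "hello world"

# Random id per call: every run looks like a brand-new document.
random_ids = ToyIndex()
first_random = random_ids.refresh([(str(uuid.uuid4()), page_text)])
second_random = random_ids.refresh([(str(uuid.uuid4()), page_text)])

# Stable id (the URL): the second run is recognized and skipped.
stable_ids = ToyIndex()
first_stable = stable_ids.refresh([(page_url, page_text)])
second_stable = stable_ids.refresh([(page_url, page_text)])
third_stable = stable_ids.refresh([(page_url, page_text + "!")])

print(first_random, second_random)  # [True] [True]
print(first_stable, second_stable, third_stable)  # [True] [False] [True]
```

If llama-index behaves like this model, two things in the snippet above would explain the fresh insertions. First, the Document is built without an explicit id, so it likely gets a new random one each time; passing a stable id such as `Document(text=page_text, doc_id=page_url)` may help (the `doc_id` keyword is an assumption about the 0.7.x constructor). Second, `VectorStoreIndex.from_documents([])` starts from an empty index on every run, so loading the persisted index with `load_index_from_storage(StorageContext.from_defaults(persist_dir=storage_location))` is probably needed for earlier insertions to be remembered.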
10 comments
index.refresh_ref_docs(documents) on a VectorStoreIndex sporadically throws "KeyError: '6dd96c06-9775-47c6-ab65-a30f235a3a7f'". Stack trace below. This is with llama-index 0.7.15.

  File "/crawler.py", line 37, in add_page_to_index
    retval = index.refresh_ref_docs(documents)
  File "/env/lib/python3.10/site-packages/llama_index/indices/base.py", line 313, in refresh_ref_docs
    self.update_ref_doc(
  File "/env/lib/python3.10/site-packages/llama_index/indices/base.py", line 277, in update_ref_doc
    self.delete_ref_doc(
  File "/env/lib/python3.10/site-packages/llama_index/indices/vector_store/base.py", line 288, in delete_ref_doc
    self._index_struct.delete(node_id)
  File "/env/lib/python3.10/site-packages/llama_index/data_structs/data_structs.py", line 198, in delete
    del self.nodes_dict[doc_id]
KeyError: '6dd96c06-9775-47c6-ab65-a30f235a3a7f'
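
The last frame of the trace is a plain `del` on a dict, which raises KeyError whenever the key is missing. The sketch below reproduces that pattern in isolation. It is not llama-index code: the function names and the out-of-sync state (bookkeeping that lists a node id the index structure no longer holds) are assumptions made for illustration, and the trace alone does not confirm that this is the actual cause.

```python
def strict_delete(nodes_dict, node_id):
    """Mirror of the failing line: raises KeyError when node_id is absent."""
    del nodes_dict[node_id]


def tolerant_delete(nodes_dict, node_id):
    """Hypothetical alternative: report whether anything was removed."""
    return nodes_dict.pop(node_id, None) is not None


# Assumed out-of-sync state: bookkeeping lists two nodes, the index holds one.
tracked_node_ids = ["node-a", "node-b"]

strict_nodes = {"node-a": "node-a"}
strict_error = None
try:
    for node_id in tracked_node_ids:
        strict_delete(strict_nodes, node_id)
except KeyError as exc:
    strict_error = exc.args[0]  # the missing key, as in the trace above

tolerant_nodes = {"node-a": "node-a"}
removed = [tolerant_delete(tolerant_nodes, n) for n in tracked_node_ids]

print(strict_error)  # node-b
print(removed)       # [True, False]
```

A sporadic failure would fit a state where only some documents have stale bookkeeping, for example after an interrupted run between the update and the persist call, but that is a guess rather than a diagnosis.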
2 comments