Thanks. Now, if I want to update only when the content has actually changed, should I retrieve the stored data and compare the page_text contents myself, or does the VectorStore have some built-in smarts to compare, say, a digest computed over the old and new page_text?
Second question: what is the size limitation on the index? Another approach for a doc id might be to stash the URL and MD5 hash together as the doc id, and I am wondering whether that might violate certain assumptions.
Finally, is there a call to retrieve a document from a VectorStoreIndex based on doc id? If there is one, I can't seem to find it on this page:
https://gpt-index.readthedocs.io/en/latest/api_reference/indices/vector_store.html
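
To make the first two questions concrete, here is a rough sketch of what I have in mind on my side. It uses only Python's standard library and makes no VectorStore calls. The helper names (content_digest, make_doc_id, needs_update), the "::" separator, and the example URL are all made up for illustration, and MD5 is used here only for change detection, not for anything security related.

```python
import hashlib
from typing import Dict, Optional


def content_digest(page_text: str) -> str:
    """Return the MD5 hex digest of the page text (change detection only)."""
    return hashlib.md5(page_text.encode("utf-8")).hexdigest()


def make_doc_id(url: str, page_text: str) -> str:
    """Hypothetical composite doc id: the URL and the content digest joined by '::'."""
    return f"{url}::{content_digest(page_text)}"


def needs_update(stored_digest: Optional[str], new_page_text: str) -> bool:
    """True if no digest was stored yet or the new text hashes differently."""
    return stored_digest != content_digest(new_page_text)


# Digests kept on my side, keyed by URL (a plain dict stands in for real storage).
seen: Dict[str, str] = {}
url = "https://example.com/page"  # placeholder URL

for text in ["hello world", "hello world", "hello world!"]:
    if needs_update(seen.get(url), text):
        seen[url] = content_digest(text)
        print("update:", make_doc_id(url, text))
    else:
        print("unchanged, skipping")
```

With this approach the composite doc id changes whenever the content changes, which is exactly why I am unsure whether it breaks any assumptions about doc ids being stable.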