I can add doc_id as a field in the metadata for a LlamaIndex document, and that will help dedupe. But how will that work if the document is, say, 1000 pages, i.e. the document is broken into multiple nodes? How does doc_id translate to multiple node identifiers, so that I can identify which node to update in the MongoDB docstore and the Qdrant vector store? If I should use node_id directly, then any guidance on how to generate node_id would be super helpful.

You can use vector_store.delete(doc_id) to delete all nodes that had that doc id listed as a source.

You could change the doc_id each time (maybe just adding v1/v2/v3 to it?).

You could fetch the nodes by their node_ids, change them, and re-add them. This is tougher with Qdrant because it needs UUIDs for node ids:

    nodes = vector_store.get_nodes(node_ids=[...])
    # do something to change them?
    vector_store.add(nodes)
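To make the first option concrete, here is a minimal sketch of how one doc_id can map to many nodes: each node keeps the id of its source document, so deleting by doc_id removes every chunk at once. ToyNode, ToyVectorStore, and chunk are hypothetical stand-ins written for this illustration; they are not the LlamaIndex or Qdrant API, and the real stores may behave differently.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ToyNode:
    node_id: str     # unique per chunk (a UUID string, as Qdrant expects)
    ref_doc_id: str  # id of the source document this chunk came from
    text: str


class ToyVectorStore:
    """In-memory stand-in for a vector store (hypothetical, not a real API)."""

    def __init__(self) -> None:
        self._nodes: Dict[str, ToyNode] = {}

    def add(self, nodes: List[ToyNode]) -> None:
        for node in nodes:
            # Adding a node with an existing node_id overwrites it (upsert).
            self._nodes[node.node_id] = node

    def delete(self, ref_doc_id: str) -> None:
        # Drop every chunk whose source document matches.
        self._nodes = {
            key: node
            for key, node in self._nodes.items()
            if node.ref_doc_id != ref_doc_id
        }

    def count(self, ref_doc_id: str) -> int:
        return sum(1 for n in self._nodes.values() if n.ref_doc_id == ref_doc_id)


def chunk(doc_id: str, pages: List[str]) -> List[ToyNode]:
    # One node per page, each with a fresh random UUID.
    return [ToyNode(str(uuid.uuid4()), doc_id, page) for page in pages]


store = ToyVectorStore()
store.add(chunk("report-2024", ["page 1", "page 2", "page 3"]))
store.add(chunk("other-doc", ["intro"]))

# The document changed: remove all of its old chunks, then add the new ones.
store.delete("report-2024")
store.add(chunk("report-2024", ["page 1 (edited)", "page 2"]))
```

With this pattern you never need to know the individual node ids up front; the doc_id alone is enough to clear out the stale chunks before re-adding.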
There is also id_func. Is that something that would be helpful?
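On the id_func question, one possible approach is to derive each node id deterministically from the document id and the chunk index, using uuid5 so the result is still a valid UUID for Qdrant. The sketch below assumes id_func is called as (i, doc) and that doc exposes its id as id_; both are assumptions to check against your installed LlamaIndex version. NAMESPACE, deterministic_id_func, and StubDoc are hypothetical names used only for this example.

```python
import uuid
from dataclasses import dataclass

# Hypothetical fixed namespace so the ids are stable across runs and machines.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "example.invalid")


def deterministic_id_func(i: int, doc) -> str:
    """Return a stable UUID string for the i-th chunk of `doc`."""
    # uuid5 hashes (namespace, name), so equal inputs always give the same UUID.
    return str(uuid.uuid5(NAMESPACE, f"{doc.id_}:{i}"))


@dataclass
class StubDoc:
    id_: str  # stand-in for a LlamaIndex document's id


doc = StubDoc(id_="report-2024")
first_run = [deterministic_id_func(i, doc) for i in range(3)]
second_run = [deterministic_id_func(i, doc) for i in range(3)]

# Assumed wiring (not executed here; verify against your installed version):
#   splitter = SentenceSplitter(id_func=deterministic_id_func)
```

Because the ids repeat across runs, re-ingesting the same document would overwrite the existing nodes instead of duplicating them. One caveat: if the new version of the document produces fewer chunks than before, the trailing old node ids are not overwritten and would still need to be deleted.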