If you set docstore_strategy to UPSERTS_AND_DELETE, then every time you run the pipeline, only updated and newly added nodes will be added to your Chroma vector store, and old nodes that are not present in the updated documents will be deleted from both the docstore and the vector store.
from llama_index.core.ingestion import DocstoreStrategy, IngestionPipeline

pipeline = IngestionPipeline(
    transformations=transformations,
    docstore=docstore,
    vector_store=your_chroma_vector_store,
    docstore_strategy=DocstoreStrategy.UPSERTS_AND_DELETE,
)
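
To make the behaviour described above concrete, here is a small pure-Python sketch of the upsert-and-delete idea. It is only a conceptual model, not LlamaIndex's actual implementation: the function name upsert_and_delete, the plain dicts standing in for the docstore and vector store, and the SHA-256 content hash are all assumptions made for illustration.

```python
import hashlib


def content_hash(text: str) -> str:
    """Return a stable hash of the document text (assumed change detector)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def upsert_and_delete(docstore: dict, vector_store: dict, documents: dict) -> dict:
    """Hypothetical model of an upsert-and-delete run.

    docstore:     doc_id -> content hash from the previous run
    vector_store: doc_id -> stored text (stand-in for embedded nodes)
    documents:    doc_id -> current text passed to this run
    """
    added, updated, deleted = [], [], []

    # Upsert: only new or changed documents are (re)written.
    for doc_id, text in documents.items():
        new_hash = content_hash(text)
        old_hash = docstore.get(doc_id)
        if old_hash is None:
            added.append(doc_id)
        elif old_hash != new_hash:
            updated.append(doc_id)
        else:
            continue  # unchanged document: skip it entirely
        docstore[doc_id] = new_hash
        vector_store[doc_id] = text

    # Delete: anything known from earlier runs but missing now is removed
    # from both stores.
    for doc_id in list(docstore):
        if doc_id not in documents:
            del docstore[doc_id]
            vector_store.pop(doc_id, None)
            deleted.append(doc_id)

    return {"added": added, "updated": updated, "deleted": deleted}


docstore, vector_store = {}, {}
first = upsert_and_delete(docstore, vector_store, {"a": "alpha", "b": "beta"})
second = upsert_and_delete(docstore, vector_store, {"b": "beta v2", "c": "gamma"})
third = upsert_and_delete(docstore, vector_store, {"b": "beta v2", "c": "gamma"})
print(first, second, third, sep="\n")
```

In this sketch the second run rewrites only "b" and "c" and removes "a" from both stores, while the third run, given identical input, changes nothing. Note that the delete step depends on each document keeping the same ID between runs; the sketch keys everything on doc_id for that reason.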