Updated 8 months ago

how can i version my documents (notion

At a glance

The community member is asking how to version their documents (Notion, PDF, etc.) for a RAG pipeline, as updating the documentation would require re-vectorizing the entire data. The comments suggest two potential solutions:

1. Setting filename_as_id=True when reading the documents, which will use the filename as a unique document ID. When a file is updated, the community member can remove all the nodes for that file ID and insert the updated file.

2. An automated way using the LlamaIndex library, which compares hashes and nodes, and only updates the documents where there is a match.

There is no explicitly marked answer, but the community members have provided suggestions to address the versioning issue.

How can I version my documents (Notion / PDF etc.) for a RAG pipeline? Let's say there is an update in the documentation; then I will have to vectorize the complete data again.
4 comments
I think you can add filename_as_id=True while reading the docs.

This will use the filename as the unique doc ID. When an existing file gets updated, you can remove all the nodes for that file ID and insert the updated file.
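To make the remove-then-insert idea concrete, here is a minimal sketch in plain Python. It does not use LlamaIndex itself: ToyNodeStore, its methods, and the chunk size are all hypothetical stand-ins for a real vector store, chosen only to show why a stable per-file ID lets you replace one file without touching the rest.

```python
class ToyNodeStore:
    """Hypothetical in-memory stand-in for a vector store (not a LlamaIndex class)."""

    def __init__(self):
        # Maps a document ID (here: the filename) to its list of text chunks ("nodes").
        self.nodes_by_doc = {}

    def insert(self, doc_id, text, chunk_size=40):
        # Split the text into fixed-size chunks and file them under the document ID.
        chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
        self.nodes_by_doc.setdefault(doc_id, []).extend(chunks)

    def delete(self, doc_id):
        # Drop every node that belongs to this document ID, if any.
        self.nodes_by_doc.pop(doc_id, None)

    def replace(self, doc_id, text):
        # The suggested workflow: remove all nodes for the file ID, then insert the new version.
        self.delete(doc_id)
        self.insert(doc_id, text)


store = ToyNodeStore()
store.insert("guide.pdf", "version one of the guide")
store.insert("notes.md", "unrelated notes")

# Only guide.pdf changed, so only its nodes are rebuilt.
store.replace("guide.pdf", "version two of the guide, now longer than before so it spans chunks")
```

In a real pipeline the insert step would also compute embeddings, which is the expensive part; keying nodes by filename keeps that cost limited to the files that actually changed.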
There is no automated way to do it currently.
It compares hashes and nodes and updates only those where it gets a match.
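The hash comparison mentioned here could work roughly as follows. This is a sketch under assumptions, not the library's actual implementation: content_hash and plan_refresh are hypothetical helper names, SHA-256 is an arbitrary choice of hash, and the interpretation (a document is re-embedded when its ID is already known but its content hash differs) is one reading of the comment.

```python
import hashlib


def content_hash(text):
    # Fingerprint of a document's content (assumption: SHA-256 over UTF-8 text).
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_refresh(stored_hashes, current_docs):
    """Decide what to do with each document.

    stored_hashes: dict of doc ID -> hash recorded at the last indexing run.
    current_docs:  dict of doc ID -> current text.
    """
    plan = {"insert": [], "update": [], "unchanged": [], "delete": []}
    for doc_id, text in current_docs.items():
        if doc_id not in stored_hashes:
            plan["insert"].append(doc_id)      # never indexed before
        elif stored_hashes[doc_id] != content_hash(text):
            plan["update"].append(doc_id)      # known ID, content changed
        else:
            plan["unchanged"].append(doc_id)   # known ID, same content: skip
    # Documents that were indexed earlier but no longer exist in the source.
    plan["delete"] = [d for d in stored_hashes if d not in current_docs]
    return plan


stored = {
    "a.md": content_hash("alpha"),
    "b.md": content_hash("beta"),
    "old.md": content_hash("gone"),
}
current = {"a.md": "alpha", "b.md": "beta v2", "c.md": "new"}

plan = plan_refresh(stored, current)
```

Only the IDs under "insert" and "update" need to be vectorized again; everything under "unchanged" keeps its existing nodes.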