Updated 2 years ago

Does anyone know of a way to

Does anyone know of a way to automatically pass in a filename as doc_id when using SimpleDirectoryReader? Kapa AI believes there is a way to do this with use_filename_as_id=True, but it is not working. E.g.:
# Import path as used in the 0.6.x releases
from llama_index import SimpleDirectoryReader

# Load documents with filenames as document IDs
documents = SimpleDirectoryReader('path/to/your/data', use_filename_as_id=True).load_data()
5 comments
No way currently.

You could use the filename_fn to set the extra_info field of each document, and then iterate over the documents and manually change each doc_id to match its filename.

Although that's pretty annoying. I can make a PR to do this more automatically.
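
For anyone who needs the manual workaround in the meantime, here is a rough sketch of the loop described above. It does not import the library: the `Document` dataclass is a hypothetical stand-in that only mirrors the `doc_id` and `extra_info` fields mentioned in the reply, and the `filename_metadata` helper and the `file_name` key are names made up for illustration, so adapt them to whatever your installed version actually exposes.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict, List


@dataclass
class Document:
    """Hypothetical stand-in for the library's Document (not the real class)."""
    text: str
    doc_id: str = ""
    extra_info: Dict[str, str] = field(default_factory=dict)


def filename_metadata(path: str) -> Dict[str, str]:
    """Illustrative metadata callback: store the file's base name under 'file_name'."""
    return {"file_name": Path(path).name}


def use_filenames_as_ids(documents: List[Document]) -> List[Document]:
    """Overwrite each doc_id with the file name found in extra_info, if any."""
    for doc in documents:
        name = doc.extra_info.get("file_name")
        if name:  # leave the original doc_id alone when no file name was recorded
            doc.doc_id = name
    return documents


# Simulate documents that a reader produced with auto-generated IDs.
docs = [
    Document("first", "a1b2", filename_metadata("data/report.txt")),
    Document("second", "c3d4", filename_metadata("data/notes.md")),
]
docs = use_filenames_as_ids(docs)
print([d.doc_id for d in docs])
```

Note that using only the base name means two files with the same name in different folders would end up with the same doc_id; storing the full path instead avoids that.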
@Logan M As always, thank you.
Should be available in v0.6.21! 🙂