Updated 3 months ago

Setting a custom doc_id for database documents

Hi - I'm trying to set a custom "doc_id" for each record from my database load, so that I can easily identify rows and hopefully not duplicate them later.

Any idea why the doc_ids are still coming out as the long default IDs?
documents = db.load_data(query = query)

for document in documents:
    # split the text by comma and take the first value
    first_value = document.get_text().split(',')[0]
    print(first_value)  # this does work
    # assign the first_value as the doc_id of the document
    document.id_ = first_value

print(documents)
4 comments
(I put my desired doc_id in the first column of my database load.)
I think you read the docs for this, right? The docs assume you have the latest version, and there was recently quite a big change under the hood for documents and nodes.

But judging from your prints, you have an older version of LlamaIndex.

In your version, you can set the doc_id using document.doc_id = first_value
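
For anyone who wants to try the pattern without a database, here is a rough, self-contained sketch. Note that `OldStyleDocument` and `assign_ids_from_first_column` are hypothetical stand-ins made up for illustration, not LlamaIndex classes or functions; the stand-in only mimics a document that keeps its ID in a `doc_id` attribute and exposes `get_text()`. Which attribute your real documents use (`doc_id` or `id_`) depends on the installed version, so treat the `attr` argument as an assumption to check against your own setup.

```python
from dataclasses import dataclass


@dataclass
class OldStyleDocument:
    """Hypothetical stand-in for a document that stores its ID in `doc_id`."""

    text: str
    doc_id: str = "long-default-id"

    def get_text(self) -> str:
        return self.text


def assign_ids_from_first_column(documents, attr="doc_id"):
    """Set each document's ID attribute to the first comma-separated value of its text."""
    for document in documents:
        # The desired ID is assumed to be the first column of the row text.
        first_value = document.get_text().split(",")[0].strip()
        # `attr` is an assumption: "doc_id" here, possibly "id_" on newer versions.
        setattr(document, attr, first_value)
    return documents


docs = assign_ids_from_first_column(
    [OldStyleDocument("42, Alice, NY"), OldStyleDocument("43, Bob, LA")]
)
print([d.doc_id for d in docs])  # ['42', '43']
```

Printing the attribute you actually assigned (instead of the whole document list) makes it easier to see whether the assignment landed where the library reads it.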
upgrading fixed it! not used to daily updating software yet 😁 - love the speed of development here. thank you!