Find answers from the community

Updated last year

hey all probably a super easy q, but how

hey all probably a super easy q, but how can I assign metadata to a each respective document?
L
c
d
21 comments
Just as a dictionary
document.metadata = {"key": "val"}
or Document(text=text, metadata=metadata)
thanks @Logan M so I have a directory full of documents
I need to assign seperate metadata to each
If you use SimpleDirectoryReader(), some metadata will be assigned automatically (including filename)

So you could post-process the output list of documents from there and use the filenames to add more metadata
ok so loop through the documents object and do document.metadata = {'foo' : 'bar'}
I do see this in the constructor file_metadata of SimpleDirectoryReader, anything of value thre?
That is a function that takes a filename and returns a dictionary
If you override it, then you also override the default metadata
Could be useful as well
Plain Text
def get_metadata(filename):
  return {"filename": filename}

document = SimpleDirectoryReader(..., file_metadata=get_metadata).load_data()
ah perfect thanks for the help!
also while I have you for a second. How can I get the LLM to return the associated metadata value, in my case the "url" key which I plan to add to the output of the LLM response?
the response object contains the nodes that were used to write the response

You can access these nodes to fetch the metadata

Plain Text
response = query_engine.query("..")
print(response.source_nodes[0].node.metadata)
If you want the LLM to actually write the URL in the response, you should be able to just include that in your query and it will figure it out I think
oh interested, I assume then the context the LLM gets the metadata in the prompt?
Yessir -- You can actually control this though
awesome thanks again for your help! you are invaluable to this community!
Hi! Just commenting here so that the thread gets saved for me. Thank you!
Add a reply
Sign up and join the conversation on Discord