
Updated 3 months ago

I'm having issues when adding documents

I'm having issues adding documents to an already-built vectorstore. After the documents are added, the metadata looks like this:
{'page_label': '2213', 'file_name': 'pfsense'}
{'page_label': '2214', 'file_name': 'pfsense'}
{'page_label': '2215', 'file_name': 'pfsense'}
{'page_label': '2216', 'file_name': 'pfsense'}
{'page_label': '2217', 'file_name': 'pfsense'}
{'page_label': '2218', 'file_name': 'pfsense'}
{'page_label': '2219', 'file_name': 'pfsense'}
{'page_label': '2220', 'file_name': 'pfsense'}
{'page_label': '2221', 'file_name': 'pfsense'}
{'page_label': '2222', 'file_name': 'pfsense'}
{'page_label': '2223', 'file_name': 'pfsense'}
I want to get rid of the page_label, because it's supposed to look like this:
{'file_name': 'pfsense'}

this is the code:
12 comments
Remove the page label from the metadata before inserting?
@Logan M SimpleDirectoryReader isn't iterable, so it can't be removed. I can show you the error:
Traceback (most recent call last):
  File "/home/headquarters/Documents/Guardian/Github Commits/DevChrom.py", line 102, in <module>
    for data in metadata:
TypeError: 'function' object is not iterable
metadata = lambda filename: {"file_name": input_directory}
for data in metadata:
    data.excluded_llm_metadata_keys = ["page_label"]
documents = SimpleDirectoryReader(input_dir=input_dir, file_metadata=metadata)
docs = documents.load_data()
Right, you can't iterate over a lambda.
Iterate over the documents/metadata after they load.
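A minimal sketch of that suggestion, for anyone reading along: load first, then loop over the resulting documents and drop the key from each one's metadata dict. `FakeDocument` and `drop_metadata_key` are hypothetical names used only for illustration; the sketch assumes each loaded document exposes a plain `metadata` dict, and the stand-in class is there so the example runs without LlamaIndex installed.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDocument:
    """Hypothetical stand-in for a loaded document; only .metadata matters here."""
    text: str
    metadata: dict = field(default_factory=dict)


def drop_metadata_key(documents, key="page_label"):
    """Remove `key` from each document's metadata dict, in place."""
    for doc in documents:
        doc.metadata.pop(key, None)  # pop with a default: no error if the key is absent
    return documents


# Assumption: with LlamaIndex, this would be applied to the list returned by
# load_data(), not to the file_metadata lambda, e.g.
#   docs = SimpleDirectoryReader(input_dir=input_dir, file_metadata=metadata).load_data()
#   drop_metadata_key(docs)
docs = [
    FakeDocument("page text", {"page_label": "2213", "file_name": "pfsense"}),
    FakeDocument("page text", {"page_label": "2214", "file_name": "pfsense"}),
]
drop_metadata_key(docs)
for doc in docs:
    print(doc.metadata)  # {'file_name': 'pfsense'}
```

The loop runs over the loaded list rather than the lambda, which is what the TypeError above was complaining about.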
I think I might understand why it's not reading: the vectorstore is stored as nodes, not documents. I just noticed that. When I add documents, they aren't nodes, they're in document text form, so I'm switching to nodes.
nvm I'll try it
@Logan M Still having issues. When I use excluded_llm_metadata_keys, it's still processing the page label.
What's the issue? What do you mean by processing? I don't fully understand
So the situation is: when I add documents using my add_data function, it's supposed to add them to the vectorstore. When I try adding documents with a metadata ID, they don't go into the database properly. Either it's not recognizing the data or it's not recognizing the metadata, because when I query, the information doesn't exist.
The code I sent shows the script behind it.
@Logan M It might be an issue with my add function, so I'm looking at the documentation for it.