Updated 2 years ago

PDFReader should be able to parse S3

PDFReader should be able to parse S3 URLs, right? I am 100% sure I had this working a few days back.
6 comments
I am seeing
FileNotFoundError: [Errno 2] No such file or directory: 'https:/mybucket.s3.us-east-1.amazonaws.com/s3--fb3cf9f4addf/bitcoin.pdf'

but when I go to that URL in my browser, I can see the PDF (fully publicly accessible)
from:
PDFReader = download_loader("PDFReader")
loader = PDFReader()
documents = loader.load_data(file=Path(chatbotUrl))
@smokeoX Hm, the PDFReader reads local files. Was this working for you before? You could try our S3 reader (which would call the PDFReader under the hood if it's a PDF file).
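For anyone hitting the same error: the single slash in `'https:/mybucket...'` is a hint that the URL went through `Path`, which treats it as a local path and collapses the `//`. A minimal sketch of one possible workaround, assuming the PDF is publicly readable, is to download it to a temporary file first and hand that local path to the reader. `download_to_temp` is a hypothetical helper name, not part of any library.

```python
import tempfile
import urllib.request
from pathlib import Path, PurePosixPath

url = "https://mybucket.s3.us-east-1.amazonaws.com/s3--fb3cf9f4addf/bitcoin.pdf"

# Path normalization collapses the "//" after the scheme, so the URL becomes
# a relative local path that does not exist on disk.
mangled = str(PurePosixPath(url))


def download_to_temp(pdf_url: str) -> Path:
    """Hypothetical helper: fetch a public URL into a temporary local file."""
    # Keep the original extension so a file reader can detect the type.
    suffix = PurePosixPath(pdf_url).suffix or ".pdf"
    with urllib.request.urlopen(pdf_url) as response:
        data = response.read()
    # delete=False keeps the file around after the handle is closed;
    # the caller is responsible for removing it later.
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        tmp.write(data)
    return Path(tmp.name)


# Usage with the loader from the question (needs network access):
# local_pdf = download_to_temp(url)
# documents = loader.load_data(file=local_pdf)
```

This only covers public objects. For private buckets, a reader that talks to S3 with credentials is probably the better fit.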
kk, will try the S3 reader. I remember having a different set of issues with that, but it prob makes sense to use the right tool for it! 😄
thanks @jerryjliu0
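If the S3 reader expects a bucket name and an object key rather than a full URL (an assumption; check the reader's own README for its actual parameters), the two pieces can be pulled out of a virtual-hosted-style URL like the one above. This is a rough sketch, and `split_s3_url` is a hypothetical helper name.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_s3_url(url: str) -> Tuple[str, str]:
    """Hypothetical helper: split a virtual-hosted-style S3 URL into (bucket, key)."""
    parsed = urlparse(url)
    # Host looks like "<bucket>.s3.<region>.amazonaws.com" or "<bucket>.s3.amazonaws.com".
    bucket, sep, rest = parsed.netloc.partition(".s3.")
    if not sep or not rest.endswith("amazonaws.com"):
        raise ValueError(f"not a virtual-hosted-style S3 URL: {url}")
    # The object key is the URL path without its leading slash.
    key = parsed.path.lstrip("/")
    return bucket, key


bucket, key = split_s3_url(
    "https://mybucket.s3.us-east-1.amazonaws.com/s3--fb3cf9f4addf/bitcoin.pdf"
)
```

Path-style URLs and bucket names that themselves contain `.s3.` are not handled here.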
hmm with that I am getting:
FileNotFoundError: [Errno 2] No such file or directory: '/Users/sim/.pyenv/versions/3.9.2/lib/python3.9/site-packages/llama_index/readers/llamahub_modules/file/base.py'

but I think this is related to packages/environment?