
Updated 6 months ago

Hi

At a glance

Community members discuss whether LlamaIndex can load PDFs directly from an S3 bucket without first downloading the bucket contents locally. The initial response suggests that it may not be possible out of the box, but that a custom loader could be written. Another community member mentions that the GitHub repo loader might do something similar, but could not confirm it at a glance. The original poster tried the S3 reader from LlamaHub and hit an error saying the resource had been permanently moved. After reading the class definition, they resolved the issue by using the "prefix" parameter instead of "key" and specifying the S3 endpoint URL.

Hi
Is it innately possible to have LlamaIndex load PDFs from an S3 bucket without us having to download the bucket contents locally and then read them from there?
7 comments
hmmm I don't think it's possible? You'd definitely have to write your own loader though (which tbh is not too hard or scary)
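For anyone wondering what "write your own loader" could look like, here is a minimal sketch, not something posted in the thread. It assumes a boto3 S3 client and the `pypdf` package for parsing; the function names (`iter_pdf_bytes`, `pdf_text`, `load_pdf_texts`) are made up for illustration. Objects are read into memory and parsed from a byte buffer, so nothing is written to local disk.

```python
import io
from typing import Callable, Dict, Iterator, Tuple


def iter_pdf_bytes(client, bucket: str, prefix: str = "") -> Iterator[Tuple[str, bytes]]:
    """Yield (key, raw bytes) for every .pdf object under a prefix."""
    # `client` is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Pages for an empty listing carry no "Contents" entry.
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if key.lower().endswith(".pdf"):
                body = client.get_object(Bucket=bucket, Key=key)["Body"]
                yield key, body.read()


def pdf_text(data: bytes) -> str:
    """Extract text from in-memory PDF bytes (assumes pypdf is installed)."""
    from pypdf import PdfReader  # imported lazily so the rest works without it

    reader = PdfReader(io.BytesIO(data))
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def load_pdf_texts(
    client,
    bucket: str,
    prefix: str = "",
    parse: Callable[[bytes], str] = pdf_text,
) -> Dict[str, str]:
    """Map each PDF key under the prefix to its extracted text."""
    return {key: parse(data) for key, data in iter_pdf_bytes(client, bucket, prefix)}
```

Each resulting text could then be wrapped in a LlamaIndex `Document` (for example `Document(text=text, metadata={"s3_key": key})`) before indexing.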
I feel like the github repo loader does something like that, but I can't figure it out at a glance
I tried one of the loaders on llamahub
https://llamahub.ai/l/readers/llama-index-readers-s3?from=readers

but I think I ran into an error that said the resource has been permanently moved
let me find the exact error message
Attachment
image.png
google tells me this is an issue with your region name or other credentials? 👀
hmm weird, I used the same credentials to download files through the CLI though 💀
ah resolved it XD
went through the class definition and found out I was supposed to use "prefix" instead of "key", and had to specify the s3_endpoint_url 😆
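The fix described above might look roughly like the sketch below. The exact code was not posted in the thread, so the bucket, prefix, and region values are placeholders, and the helper names (`s3_reader_kwargs`, `load_from_s3`) are hypothetical. It assumes the `llama-index-readers-s3` package, whose `S3Reader` accepts `bucket`, `prefix`, and `s3_endpoint_url`, and it assumes a standard AWS regional endpoint of the form `https://s3.<region>.amazonaws.com`. A "permanently moved" response from S3 usually means the request went to an endpoint for a different region than the bucket's.

```python
from typing import Any, Dict, Optional


def s3_reader_kwargs(bucket: str, prefix: str, region: Optional[str] = None) -> Dict[str, Any]:
    """Build S3Reader arguments that select objects by prefix rather than a single key."""
    kwargs: Dict[str, Any] = {"bucket": bucket, "prefix": prefix}
    if region:
        # Assumption: a standard AWS regional endpoint; other S3-compatible
        # stores would need their own URL here.
        kwargs["s3_endpoint_url"] = f"https://s3.{region}.amazonaws.com"
    return kwargs


def load_from_s3(bucket: str, prefix: str, region: Optional[str] = None):
    """Load every object under the prefix as LlamaIndex documents."""
    # Imported lazily; requires `pip install llama-index-readers-s3`.
    from llama_index.readers.s3 import S3Reader

    reader = S3Reader(**s3_reader_kwargs(bucket, prefix, region))
    return reader.load_data()
```

For example, `load_from_s3("my-bucket", "reports/", region="eu-west-1")` would read everything under `reports/`. Credentials are left to the default AWS credential lookup in this sketch.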