
Updated last year

Is there a loader that uses OCR as

Is there a loader that uses OCR by default to scan PDFs and includes metadata like page numbers and filename?
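from pathlib import Path

reader = PDFNougatOCR()
# join with Path so a missing trailing slash in directory_path doesn't break the path
pdf_path = Path(directory_path) / 'my_pdf.pdf'
documents = reader.load_data(pdf_path)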
This gets the error: ERROR:root:An error occurred while processing the PDF: [Errno 2] No such file or directory: 'nougat'
Am I supposed to pass just a single string with the full path, or is this some kind of issue with the import?
yeah, what's your directory path? seems like it doesn't exist
I was missing a requirement; installing it solved that error
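The `[Errno 2] No such file or directory: 'nougat'` message suggests the loader calls an external `nougat` command that was not on the PATH, rather than a problem with the PDF path. A minimal sketch of a pre-flight check, assuming the executable is named `nougat` as in the error text (the helper name `find_cli` is hypothetical, and the thread does not say which package provides the command):

```python
import shutil
from typing import Optional


def find_cli(name: str = "nougat") -> Optional[str]:
    """Return the full path of `name` if it is on PATH, otherwise None."""
    return shutil.which(name)


path = find_cli()
if path is None:
    # Same situation as the Errno 2 error in the thread: the command is missing.
    print("'nougat' was not found on PATH; install the missing requirement first.")
else:
    print(f"'nougat' found at {path}")
```

Running this before `reader.load_data(...)` separates "the command is not installed" from "the PDF path is wrong".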
but wow, this takes forever to run on a single PDF if you are using CPU
yes, you need a GPU!
it's pretty good tho
any suggestions for Mac people?
most OCR models that are worth it are going to use GPU 😅

Although I don't think there's a loader for tesseract, so you'd have to create the document objects yourself (it's really not that hard though 🙏)
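Creating the document objects yourself could look roughly like the sketch below. It is only an illustration under assumptions: it assumes `pdf2image` (with poppler) and `pytesseract` (with the tesseract binary) are installed, the helper names and the metadata keys `file_name` and `page_number` are hypothetical, and the `Document` import path depends on the installed llama_index version.

```python
from pathlib import Path
from typing import Dict, List


def build_page_records(pdf_path: str, page_texts: List[str]) -> List[Dict]:
    """Pair each page's OCR text with the filename and a 1-based page number."""
    filename = Path(pdf_path).name
    return [
        {"text": text, "metadata": {"file_name": filename, "page_number": i}}
        for i, text in enumerate(page_texts, start=1)
    ]


def ocr_pdf_pages(pdf_path: str) -> List[str]:
    """OCR every page with Tesseract (assumes pdf2image and pytesseract are installed)."""
    from pdf2image import convert_from_path  # renders each PDF page to a PIL image
    import pytesseract

    images = convert_from_path(pdf_path)
    return [pytesseract.image_to_string(image) for image in images]


def to_documents(records: List[Dict]) -> list:
    """Wrap the records in llama_index Document objects."""
    try:
        from llama_index.core import Document  # newer package layout
    except ImportError:
        from llama_index import Document  # older package layout
    return [Document(text=r["text"], metadata=r["metadata"]) for r in records]
```

Usage would be something like `docs = to_documents(build_page_records(p, ocr_pdf_pages(p)))` for a PDF path `p`. Tesseract runs on CPU, so this may also be an option on a Mac, though the OCR quality will differ from nougat.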
@velocitybolt @Logan M use Colab? Is there a way to make nougat use a Mac's GPU (M1/M2)?
I think Logan once told me model.to("mps:0")
nougat tries to use mps automatically under the hood, but I couldn't get it to work on my M2 🤔 (it's all in some external library)

colab is a good option
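For reference, the `model.to("mps:0")` idea is the usual PyTorch device-selection pattern. A minimal sketch, assuming PyTorch is installed (the helper name `pick_device` is hypothetical, and whether nougat actually honors the chosen device is not confirmed in this thread):

```python
def pick_device() -> str:
    """Pick the best available PyTorch device name, falling back to CPU."""
    try:
        import torch
    except ImportError:
        return "cpu"  # no PyTorch installed, nothing to accelerate
    if torch.cuda.is_available():
        return "cuda"  # NVIDIA GPU, e.g. a Colab GPU runtime
    mps = getattr(torch.backends, "mps", None)  # absent on older PyTorch builds
    if mps is not None and mps.is_available():
        return "mps"  # Apple Silicon GPU (M1/M2)
    return "cpu"


device = pick_device()
print(f"Using device: {device}")
# model.to(device)  # `model` is a placeholder for any torch.nn.Module
```

If this prints `cpu` on an M1/M2 machine, the installed PyTorch build likely lacks MPS support, and Colab with a GPU runtime remains the simpler route.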
Hello, do you have any code that could help me? I am using Colab and this is not really working for me:
from llama_index import PDFNougatOCR

# Initialize the PDFNougatOCR reader
pdf_reader = PDFNougatOCR()
I want to convert non-modifiable PDFs into modifiable ones so I can use them in my RAG.