The community member wants to parse a payroll PDF and has tried LlamaParse, but it did not give good results. In the comments, one community member suggests that LlamaParse should work fine, especially in premium mode. Another community member offers alternatives: for 'nice' PDFs, the Python OCR library pytesseract works okay, but on not-nice scans it can make mistakes. They also mention that Google Cloud's DocumentAI has good off-the-shelf document parsers, plus a 'custom extractor' option for hand-labeling documents, which can be pretty much perfect if the labeling is good. However, they note that navigating the GCP interface is annoying.
For 'nice' PDFs, the Python OCR library pytesseract works OK. For not-nice scans it can make mistakes. Google Cloud's DocumentAI has good off-the-shelf document parsers. There's also an option to hand-label documents (the 'custom extractor'), and if your labeling is good, it's pretty much perfect. However, it's annoying to navigate the GCP interface, unfortunately.
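To make the pytesseract route concrete, here is a rough sketch of what it could look like for a payslip. It assumes pdf2image (which needs the poppler utilities) to turn PDF pages into images, and the Tesseract binary installed for pytesseract. The helper names (`ocr_pdf`, `low_confidence_words`, `parse_amounts`), the confidence threshold of 60, and the "label followed by an amount" line layout are all my own assumptions for illustration, not something from the thread, so expect to adjust the regex to your actual payroll format.

```python
import re
from decimal import Decimal

# Assumed layout: a text label followed by a money amount at the end of the
# line, e.g. "Gross Pay    1,234.56" or "Net Pay: $987.00".
LINE_RE = re.compile(
    r"^\s*(?P<label>[A-Za-z][A-Za-z .&/-]*?)\s*:?\s*\$?"
    r"(?P<amount>\d[\d,]*\.\d{2})\s*$"
)


def ocr_pdf(pdf_path, dpi=300):
    """OCR every page of a PDF and return a list with the text of each page."""
    # Imported lazily so the pure-Python helpers below work without OCR installed.
    from pdf2image import convert_from_path  # needs the poppler utilities
    import pytesseract  # needs the tesseract binary

    pages = convert_from_path(pdf_path, dpi=dpi)
    return [pytesseract.image_to_string(page) for page in pages]


def low_confidence_words(data, threshold=60.0):
    """Return (word, confidence) pairs that Tesseract was unsure about.

    `data` is the dict returned by
    pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT).
    """
    flagged = []
    for word, conf in zip(data["text"], data["conf"]):
        conf = float(conf)  # some versions return strings, and -1 for non-words
        if word.strip() and 0 <= conf < threshold:
            flagged.append((word, conf))
    return flagged


def parse_amounts(text):
    """Pull 'label amount' pairs out of OCR text into a dict of Decimals."""
    amounts = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            value = Decimal(match.group("amount").replace(",", ""))
            amounts[match.group("label").strip()] = value
    return amounts
```

Typical use would be `parse_amounts("\n".join(ocr_pdf("payslip.pdf")))`, where `payslip.pdf` is a placeholder path. On not-nice scans, running `low_confidence_words` on the `image_to_data` output is one way to spot the words that probably need a manual check before you trust the numbers.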