Inactive
Updated 3 months ago
what has been your best document parser
Rainher
9 months ago
What has been your best document parser, especially for PDFs with tables? So far it looks like only llmsherpa and LlamaParse (which I can't use) do a good job with tables.
9 comments
Emanuel Ferreira
9 months ago
Can you tell us why you can't use LlamaParse?
Rainher
9 months ago
Privacy issues: documents can't leave the network (these are actual legal proceedings that might still be active).
Rainher
9 months ago
We would be super happy to have an on-prem option, given that it got me the best results, but for now it's a no-go.
Rainher
9 months ago
Unstructured, on the other hand, is not really very good
Emanuel Ferreira
9 months ago
You can contact us about the on-prem option through the platform contact form.
Emanuel Ferreira
9 months ago
Also, the best alternative to LlamaParse is likely PyMuPDF.
verdverm
9 months ago
(slight tangent)
Is PyMuPDF better than Camelot at tables? Or, if we are dealing with tables specifically, do you think Camelot is the way to go?
verdverm
9 months ago
Does LlamaParse have the features in PyMuPDF to fix corrupted PDFs? That's intriguing and helpful. Would you use them in concert?
Rainher
9 months ago
I think I will give PyMuPDF a quick go
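For anyone else trying the same route, here is a minimal sketch of what a fully local table pass with PyMuPDF might look like, which fits the constraint that documents can't leave the network. It assumes PyMuPDF 1.23.0 or later (where `Page.find_tables()` is available); the function names `extract_tables` and `rows_to_csv` and the file name in the usage note are hypothetical, and detection quality on real legal documents is not something this sketch can promise.

```python
import csv
import io


def extract_tables(pdf_path):
    """Return every detected table as a list of rows (each row a list of cells)."""
    import fitz  # PyMuPDF; imported lazily so rows_to_csv works without it

    tables = []
    with fitz.open(pdf_path) as doc:
        for page in doc:
            # find_tables() needs PyMuPDF 1.23.0+; .tables is the list of hits
            for table in page.find_tables().tables:
                tables.append(table.extract())
    return tables


def rows_to_csv(rows):
    """Serialize one extracted table to CSV text (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in rows:
        # Empty cells come back as None; multi-line cells are flattened
        writer.writerow(
            "" if cell is None else str(cell).replace("\n", " ") for cell in row
        )
    return buffer.getvalue()
```

Usage would be along the lines of `for rows in extract_tables("filing.pdf"): print(rows_to_csv(rows))`, where `filing.pdf` is a placeholder path. Everything runs in-process, so nothing is sent to an external service.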