The community member is asking whether LlamaParse removes the existing OCR layer from PDFs or runs its own OCR on top of it. The comments suggest that LlamaParse does not remove existing OCR, but that its output may contain words or elements that are not present in the original PDF. One community member notes that they have experienced this issue, where the text LlamaParse returned differed from the text in the original document. However, there is no explicitly marked answer to the original question.
When I use LlamaParse and print out the result, it seems to introduce words or elements that are not in the PDF. I’m curious whether LlamaParse removes the existing OCR and runs its own. For example, I have a program that OCRs the docs and does a pretty good job. The OCR text will contain a name like Gia Allen, but the LlamaParse output will show something like Gio Olsen. And of course, if I search the original doc that was fed in, that name is nowhere in the PDF. I have found other instances of this. So I’m wondering whether, under the hood, the PDF is manipulated in some way before processing. I appreciate your responses, thanks so much!
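For anyone wanting to check this systematically rather than by manual search, here is a minimal sketch. It assumes you already have both texts as plain strings: one from your own OCR program and one from the parser output. The function names `tokens` and `introduced_words` are hypothetical helpers for illustration, not part of LlamaParse, and the sample strings are made up to mirror the example above.

```python
import re


def tokens(text: str) -> set[str]:
    """Return lowercase word tokens, so 'Gia' and 'gia' compare equal."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def introduced_words(original_text: str, parsed_text: str) -> list[str]:
    """List words that appear in the parsed output but not in the original text."""
    return sorted(tokens(parsed_text) - tokens(original_text))


# Illustrative strings only; substitute your own OCR text and parser output.
original_text = "Patient name: Gia Allen"
parsed_text = "Patient name: Gio Olsen"

print(introduced_words(original_text, parsed_text))
```

This only flags whole words that never occur in the original text. It will not catch a word that was moved or duplicated, and it will also flag harmless differences such as hyphenation or formatting markup added by the parser.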