
Updated 5 months ago

Let's say you have a pdf with a variety


Let's say you have a PDF with a variety of instructions, with pictures between each instruction. The parsers I'm using seem to store only the text. Is there another setup that also supports images? I'd like the answers it returns to include the screenshots.

3 comments

Orion Pax

My sense here is that I need to preprocess the PDF.

Open the PDF.
Extract each image and save it with a reference.
Replace each image in the text with a link to the saved image.
Save the new file for indexing.
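
The steps above could be sketched roughly as follows. This is only a minimal sketch, assuming PyMuPDF (imported as `fitz`) is used to read the PDF; the thread does not name a library for this step. The function names, the `images` directory, and the `pageN_imgM` file naming are all hypothetical. Blocks are written in the order PyMuPDF returns them, which may not always match the visual reading order of the page.

```python
from pathlib import Path


def blocks_to_text(blocks, page_no, image_dir):
    """Turn one page's blocks into text, swapping each image for a link.

    `blocks` is assumed to follow the layout of PyMuPDF's
    page.get_text("dict")["blocks"]: type 0 is a text block, type 1 is an
    image block carrying raw bytes under "image" and an extension under "ext".
    """
    image_dir = Path(image_dir)
    image_dir.mkdir(parents=True, exist_ok=True)
    parts = []
    image_count = 0
    for block in blocks:
        if block["type"] == 1:
            image_count += 1
            # Hypothetical naming scheme: page number plus a running index.
            name = f"page{page_no}_img{image_count}.{block['ext']}"
            (image_dir / name).write_bytes(block["image"])
            # Leave a Markdown-style link where the image used to be.
            parts.append(f"![{name}]({image_dir.as_posix()}/{name})")
        else:
            lines = [
                "".join(span["text"] for span in line["spans"])
                for line in block["lines"]
            ]
            parts.append("\n".join(lines))
    return "\n\n".join(parts)


def preprocess_pdf(pdf_path, out_path, image_dir="images"):
    """Write a text file in which every embedded image becomes a link."""
    import fitz  # PyMuPDF; imported lazily so blocks_to_text works without it

    pages = []
    with fitz.open(pdf_path) as doc:
        for page_no, page in enumerate(doc, start=1):
            blocks = page.get_text("dict")["blocks"]
            pages.append(blocks_to_text(blocks, page_no, image_dir))
    Path(out_path).write_text("\n\n".join(pages), encoding="utf-8")
```

Calling `preprocess_pdf("manual.pdf", "manual.md")` (both file names are placeholders) would produce a text file to index, with the extracted screenshots saved alongside it so that a retrieved chunk carries the link to its image.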

Logan M

Have you tried llama-parse? This is pretty much what it was made for.

Logan M

Alternatively, if you need a local solution, tools like unstructured or marker will work, though probably not as well.
