
Updated 2 years ago

Hi Team, as I am new to LlamaIndex, please help me with the question below. I need to run Q&A over a set of documents that contain both text and images (such as flow charts, tables, and images with text inside them). How can we extract all of this content and run Q&A over it? Does LlamaIndex support this? If so, is there a good reference on how to achieve it?
6 comments
You can check out https://llamahub.ai/ for data loaders to see if any of those might help!
The Flat-PDF loader does have some support for images, but it will either apply OCR or generate a caption using a local model

Bit of a "beta" feature at the moment
https://github.com/emptycrown/llama-hub/blob/main/llama_hub/file/flat_pdf/base.py
Thanks, but all of these take an image as input. I have a document, say a PDF, that contains both images and text. Please suggest which loader I should consider (apart from Flat-PDF).
I think flatpdf is the only one supporting mixed content at the moment. Unstructured might also though 🤔

Tbh you might be better off rolling your own loader for cases like this too
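For what "rolling your own loader" could look like, here is a minimal sketch in plain Python. It is not LlamaIndex's or LlamaHub's API: `Element`, `build_page_texts`, and `fake_describe` are hypothetical names made up for illustration. It assumes you already have some way to pull text blocks and image bytes out of each page in reading order, and some OCR or captioning function to turn an image into text; both of those are left as inputs rather than implemented.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Union


@dataclass
class Element:
    """One piece of page content, in reading order (hypothetical type)."""
    page: int
    kind: str                    # "text" or "image"
    payload: Union[str, bytes]   # str for text, raw bytes for an image


def build_page_texts(
    elements: Iterable[Element],
    describe_image: Callable[[bytes], str],
) -> List[dict]:
    """Merge text and image-derived text into one record per page."""
    pages: dict = {}
    for el in elements:
        if el.kind == "text":
            chunk = el.payload
        elif el.kind == "image":
            # OCR or captioning is delegated to the injected callable.
            chunk = "[image] " + describe_image(el.payload)
        else:
            raise ValueError(f"unknown element kind: {el.kind!r}")
        pages.setdefault(el.page, []).append(chunk.strip())
    # One record per page, ordered by page number.
    return [
        {"text": "\n".join(chunks), "metadata": {"page": page}}
        for page, chunks in sorted(pages.items())
    ]


# Stand-in for a real OCR or captioning step (assumption for this demo).
def fake_describe(image_bytes: bytes) -> str:
    return f"flowchart placeholder ({len(image_bytes)} bytes)"


sample = [
    Element(page=1, kind="text", payload="Order process overview."),
    Element(page=1, kind="image", payload=b"\x89PNG..."),
    Element(page=2, kind="text", payload="Refund policy."),
]
records = build_page_texts(sample, fake_describe)
```

Each record keeps the image-derived text next to the surrounding page text, so a question about a flowchart can still retrieve the right page. The `text` and `metadata` fields can then be wrapped in whatever document type your index expects.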
I tried using Flat-PDF, but it is not interpreting the images in the PDF file. Is there any other way to interpret flowcharts and diagrams in a document?