
Updated 2 months ago

Logan M needed some guidance help

I need some guidance/help.
My use case is that I am given a question paper, and for each question paper there's a corresponding marking scheme. I need to read the questions from the question paper PDF; the LLM shouldn't create its own questions. The same goes for the marking scheme. I feel it's a good use case for OpenAIPydantic program. What do you think?
8 comments
Yea that sounds about right. There are two parts here: getting the text off the PDF correctly, and then grading it with an LLM (likely a pydantic program)
How do I achieve the "getting the text off the PDF correctly" part? Same pydantic program, right?
Or a normal query engine will do?
I think just a normal PDF loader will work? Or if it's not a true digital PDF, you may have to use OCR?
But a normal PDF loader wouldn't return "Question" objects right? It will just read the pdf text.
Right -- I'm assuming the PDF is somewhat formatted though, so hopefully it's easy to just parse/split the text?
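To make the "just parse/split the text" idea concrete, here is a minimal sketch using only the Python standard library. It assumes each question starts on its own line with a marker like `Q1.` or `1)`; that layout, the `Question` dataclass, and the `split_questions` helper are all hypothetical and not part of LlamaIndex. Since no LLM is involved in this step, nothing can invent questions that aren't in the paper.

```python
import re
from dataclasses import dataclass


@dataclass
class Question:
    number: int
    text: str


# Assumed layout: a question starts at the beginning of a line with
# "Q<number>." / "<number>." / "<number>)". Adjust to the real paper format.
QUESTION_START = re.compile(r"^\s*(?:Q\s*)?(\d+)[.)]\s+", re.MULTILINE)


def split_questions(raw_text: str) -> list[Question]:
    """Split plain text from a PDF loader into Question objects."""
    matches = list(QUESTION_START.finditer(raw_text))
    questions = []
    for i, match in enumerate(matches):
        # A question's body runs until the next question marker (or end of text).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(raw_text)
        body = raw_text[match.end():end]
        # Collapse the line breaks that PDF extraction tends to introduce.
        questions.append(Question(int(match.group(1)), " ".join(body.split())))
    return questions


sample = """Q1. Define photosynthesis.
Q2. State Newton's second law
and give one example.
"""
questions = split_questions(sample)
```

The same splitter could probably be reused for the marking scheme if it follows the same numbering, so questions and scheme entries can be paired by `number`. Numbered lists inside a question body would be split as separate questions under this pattern, so the regex would need tightening for papers that use them.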
Hmmm. I can work on formatting the PDF documents that way.
Also, does LlamaIndex have a loader for tables and diagrams?
Hmm not really 🤔 You can use a package like camelot to try and get tables out of PDFs

For images, I think some PDF libraries can also spit out images
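For the camelot suggestion above, a rough sketch might look like the following. It assumes the `camelot-py` package is installed and that the PDF is text-based (Camelot does not handle scanned pages); the `extract_tables` helper and the file name in the comment are hypothetical.

```python
def extract_tables(pdf_path: str, pages: str = "1"):
    """Return each table Camelot detects on the given pages as a pandas DataFrame."""
    # Imported lazily so this module still loads where camelot is not installed.
    import camelot

    # "lattice" suits tables drawn with ruled lines; try flavor="stream"
    # for tables whose columns are separated only by whitespace.
    tables = camelot.read_pdf(pdf_path, pages=pages, flavor="lattice")
    return [table.df for table in tables]


# Example call (hypothetical file name):
# frames = extract_tables("question_paper.pdf", pages="1-3")
```

Each returned DataFrame could then be converted to text (for example with `to_csv` or `to_markdown`) before being handed to the grading step.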
got it!
Thanks ❤️