The community members are discussing ways to parse complex PDFs and extract tables, images, and text, potentially using large language models (LLMs) like GPT-4 or a locally deployed multimodal model. One community member suggests using a tool called "llama parse" for this task. Another community member asks if they can use their locally deployed model, and the response is that they can, but it might take longer. The discussion also touches on the cost of using a service like llama parse, especially for a large dataset of 1TB. The community members explore different approaches, such as sending PDF pages to a multimodal LLM and prompting it to extract the desired information. However, there is no explicitly marked answer in the provided information.
is there a wat to parse a complex pdf extracting tables, images and text, maybe even llm (gpt4o or maybe a local multimodel one)? maybe while reading files @Logan M