A community member is looking for resources to build a parser similar to Llama-Parse. Because of data safety concerns they cannot use cloud providers and must rely on a local LLM and embedding model. Other community members suggest Unstructured, an open-source option, while noting that it may not match the capabilities of Llama-Parse. They also mention the UnstructuredElementNodeParser as a potential solution, and one community member thinks the normal Unstructured reader/loader might be what the original poster is looking for.
Are there any good resources for building a parser similar to Llama-Parse? My issue is that, for data safety reasons, we cannot use any cloud providers and must rely on a local LLM and embedding model. I'm watching Jerry Liu's Ray Summit 2024 video and am considering how I might replicate at least some of the basics. Cheers!
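For anyone trying the Unstructured route suggested in the thread, here is a rough sketch of what a fully local first pass could look like. It assumes the open-source `unstructured` package is installed and uses its `partition` function to split a file into typed elements. The helper names `load_elements` and `group_by_title`, and the section dictionary layout, are made up for illustration. This is not how Llama-Parse works internally, just one simple way to get title-delimited sections with tables kept separate.

```python
from typing import Iterable, List, Tuple

# Hypothetical plain representation of a parsed element: (category, text).
Element = Tuple[str, str]


def load_elements(path: str) -> List[Element]:
    """Parse a local file with the open-source Unstructured library."""
    # Imported lazily so the grouping logic below runs without the package.
    from unstructured.partition.auto import partition

    # Each element exposes a category (e.g. "Title", "Table") and its text.
    return [(el.category, el.text) for el in partition(filename=path)]


def _has_content(section: dict) -> bool:
    """True if the section has a title, body text, or at least one table."""
    return bool(section["title"] is not None or section["body"] or section["tables"])


def group_by_title(elements: Iterable[Element]) -> List[dict]:
    """Start a new section at each "Title" element and collect what follows.

    Tables are stored apart from body text so they can be handled
    differently later (for example, summarised by a local LLM).
    """
    sections: List[dict] = []
    current = {"title": None, "body": [], "tables": []}
    for category, text in elements:
        if category == "Title":
            # Close the previous section before opening a new one.
            if _has_content(current):
                sections.append(current)
            current = {"title": text, "body": [], "tables": []}
        elif category == "Table":
            current["tables"].append(text)
        else:
            current["body"].append(text)
    if _has_content(current):
        sections.append(current)
    return sections
```

Calling `group_by_title(load_elements("report.pdf"))` (the file name is only a placeholder) would give a list of sections that can then be chunked and passed to whatever local embedding model is in use. Text before the first title ends up in a section whose `title` is `None`.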