I don't know if you remember, but I tried developing my own parser for .pdf data (to try to match at least some capabilities of llama-parse). During my research I stumbled upon this nv-ingest. If I understand correctly, the nv-ingest output could then be parsed into nodes and stored.
Yeah, now I just need to sell the idea of GPUs at 30k a piece to my boss hehe... Thanks for the feedback! If you will be doing any implementation or other form of NVIDIA collab, I would be glad to help.