Hey, when using unstructured, can you help me figure out what format the data needs to be in to use LlamaIndex? I am looking to extend this from partition_html to the auto partition. It looks like I will need to write my own extract_elements, since the existing one is hard-coded for HTML.
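
A rough sketch of what a format-agnostic extract_elements might look like, in case it helps frame the question. Everything here is an assumption: `FakeElement` is a stand-in for whatever the auto partition returns (assumed to expose `.text` and `.category`), and the record shape (a `text` string plus a `metadata` dict) is only a guess at what would be handed on to LlamaIndex.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List


@dataclass
class FakeElement:
    """Stand-in for a partitioned element (assumed to expose .text and .category)."""
    text: str
    category: str = "NarrativeText"


def extract_elements(elements: Iterable[Any]) -> List[Dict[str, Any]]:
    """Hypothetical helper: turn elements of any source format into text + metadata records."""
    records: List[Dict[str, Any]] = []
    for el in elements:
        # Read attributes defensively so the helper does not depend on HTML-specific fields.
        text = (getattr(el, "text", "") or "").strip()
        if not text:
            continue  # skip elements that carry no text
        records.append({
            "text": text,
            "metadata": {"category": getattr(el, "category", "Unknown")},
        })
    return records


# Example input: the whitespace-only element is dropped.
sample = [FakeElement("Intro", "Title"), FakeElement("  "), FakeElement("Body text.")]
records = extract_elements(sample)
```

With the real libraries, the elements would presumably come from `partition(filename=...)` in `unstructured.partition.auto` instead of `FakeElement`, and each record could be wrapped in a LlamaIndex `Document(text=..., metadata=...)`. Both of those should be checked against the installed versions.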