Updated 10 months ago

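```py
from pathlib import Path

from llama_index.readers.file.flat_reader import FlatReader
# Import path assumed; in llama-index >= 0.10 it is llama_index.core.node_parser
from llama_index.node_parser import UnstructuredElementNodeParser

reader = FlatReader()
node_parser = UnstructuredElementNodeParser()
docs = reader.load_data(Path("your_path.html"))
raw_nodes = node_parser.get_nodes_from_documents(docs)
```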

Why am I getting the `BadZipFile: File is not a zip file` error? Do you know how to solve this? Thank you :D
24 comments
Where did you get this reader, `reader = FlatReader()`, from?
It's actually slightly hidden in llama-index; it just reads files as-is, with zero processing.
@Nyse are you running this reader on zip files?
Wait, is it all right to use Python 3.11?
Because I used to run it on Colab, and it worked there
(Colab's Python version is 3.10)
from llama_index.readers.file.flat_reader import FlatReader
from this one
After I changed my Python to 3.10, it's still the same
Can you do that for me? This one is my HTML file. It's a public HTML page
@WhiteFang_Jr @Logan M sorry to interrupt 😄
Is it because of VS Code on my MacBook?
Finally, I'm done
Let me try with a sample HTML file. Can you give me the code that you are trying?
No, it was because I forgot to download the NLTK perceptron
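For reference, a minimal sketch of the fix described in the message above. It assumes "nltk perceptron" means NLTK's `averaged_perceptron_tagger` resource, which the thread does not name exactly. The helper `ensure_resources` is hypothetical; only `nltk.download` is the real NLTK call, and it returns `True` on success.

```python
def ensure_resources(resources, download=None):
    """Download each named NLTK resource and return the names that failed.

    `download` defaults to nltk.download; it is a parameter only so the
    helper can be exercised without NLTK installed or network access.
    """
    if download is None:
        import nltk  # imported lazily, only when the real downloader is needed

        download = nltk.download

    failed = []
    for name in resources:
        # nltk.download returns True on success and False on failure
        if not download(name):
            failed.append(name)
    return failed
```

Calling `ensure_resources(["averaged_perceptron_tagger"])` once in the same Python environment that runs the parser (here, the VS Code interpreter rather than Colab) should be enough, assuming that resource name is the right one.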
thank you very much
Ah okay, it's great that you solved it on your own
But I'm still confused
Why do we need to download the NLTK perceptron?
And how is that related to BadZipFile?
Even I don't know this. But if you are using a local embed model, it uses NLTK internally
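The thread never settles how the two are related. One plausible explanation, which is an assumption and not confirmed above: NLTK distributes its data packages as zip archives, so a package that was only partly downloaded can surface as `zipfile.BadZipFile` when a library tries to open it. The sketch below uses only the standard library to look for such files. `find_corrupt_zips` is a hypothetical helper, and `~/nltk_data` is assumed to be the NLTK data directory on this machine.

```python
import zipfile
from pathlib import Path


def find_corrupt_zips(root):
    """Return .zip files under `root` that are not valid zip archives."""
    root = Path(root).expanduser()
    # zipfile.is_zipfile checks for the zip signature instead of trusting
    # the file extension, so truncated downloads are reported here.
    return sorted(p for p in root.rglob("*.zip") if not zipfile.is_zipfile(p))
```

If `find_corrupt_zips("~/nltk_data")` reports anything, deleting those files and downloading the resource again would be the thing to try.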