
Updated 3 months ago

HTML Files for menus

Any help?
16 comments
Do you get any error?
It just gives me "None"
I think it's because I may not be using an unstructured reader?
Are you seeing documents in the documents list?
I'm not. Hold on, I may have found a solution: just have LlamaIndex use a web reader to fetch all of the menu websites, because my college set up their menus like this,
with a horrible HTML file per week, and it's stupid.
However, if I pass all of the URLs in like this
Your solution should work,
but you'll probably want to use a CodeSplitter to split the text into nodes:
Plain Text
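from llama_index import Document
from llama_index.node_parser import SimpleNodeParser
from llama_index.text_splitter import CodeSplitter

# Import paths above assume an older llama_index release that still has
# SimpleNodeParser; adjust them for your installed version.
text_splitter = CodeSplitter(
    language="html",
    chunk_lines=120,
    chunk_lines_overlap=10,
    max_chars=1000,
)

# `html` is the page source and `url` is its address, both loaded earlier
html_documents = [Document(text=html, metadata={"url": url})]
node_parser = SimpleNodeParser.from_defaults(text_splitter=text_splitter)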
I'll make sure to try that. I'm interested to see if the simple web loader works, because I'm lazy!
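If the loader keeps returning "None" or an empty documents list, it can help to check what text is actually in those menu pages before indexing them. Here is a rough sketch using only Python's standard library, not LlamaIndex itself. The sample page, the example.edu URL, and the names `MenuTextExtractor`, `html_to_text`, and `records` are all made up for illustration; real menu pages will look different.

```python
from html.parser import HTMLParser


class MenuTextExtractor(HTMLParser):
    """Collects visible text and skips <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside <script> or <style>
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.parts.append(text)


def html_to_text(html: str) -> str:
    """Return the visible text of an HTML string, one fragment per line."""
    parser = MenuTextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


# Hypothetical sample page standing in for one weekly menu file.
sample_pages = {
    "https://example.edu/menu/week1.html": (
        "<html><head><style>td{color:red}</style></head>"
        "<body><table><tr><td>Monday</td><td>Pasta</td></tr></table></body></html>"
    ),
}

# One record per page, keeping the URL as metadata (same idea as metadata={"url": url}).
records = [
    {"text": html_to_text(html), "metadata": {"url": url}}
    for url, html in sample_pages.items()
]

for record in records:
    print(record["metadata"]["url"])
    print(record["text"])
```

If the extracted text comes out empty for a real page, the menu is probably rendered by JavaScript or embedded some other way, and a plain HTML reader would not see it either.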