Updated 4 weeks ago

Worked with web reader

Hi, do you think it is a good approach to create an ingestion pipeline with documents from SimpleDirectoryReader and nodes from HTML files parsed with HTMLNodeParser?
1 comment
It works with the web reader 😄
web:
  driver_arguments:
    - --no-sandbox
    - --disable-dev-shm-usage
    - --headless
  urls: 
    - prefix: "file:///app/data/web/confluence-export/Folder"
      base_url: "file:///app/data/web/confluence-export/Folder/index.html"
      max_depth: 10000
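
For anyone wondering how settings like these might limit a crawl, here is a rough standard-library sketch. It is only an assumption about the semantics: it treats `base_url` as the starting page, `prefix` as a filter on which links are followed, and `max_depth` as a cap on how many links away from the start the crawl goes. The `crawl_in_scope` function, the `get_links` callback, and the in-memory `LINKS` graph are all hypothetical and are not part of any web reader's API.

```python
from collections import deque


def crawl_in_scope(base_url, prefix, max_depth, get_links):
    """Breadth-first walk from base_url, keeping only URLs under prefix.

    get_links is a hypothetical callback that returns the links found
    on a page; a real reader would load and parse the page instead.
    """
    seen = {base_url}
    order = [base_url]
    queue = deque([(base_url, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth >= max_depth:
            # Assumed meaning of max_depth: do not follow links past this depth.
            continue
        for link in get_links(url):
            # Assumed meaning of prefix: skip anything outside the export folder.
            if link.startswith(prefix) and link not in seen:
                seen.add(link)
                order.append(link)
                queue.append((link, depth + 1))
    return order


# Hypothetical link graph standing in for the exported HTML files.
ROOT = "file:///app/data/web/confluence-export/Folder"
LINKS = {
    f"{ROOT}/index.html": [
        f"{ROOT}/page-a.html",
        "file:///app/data/other/outside.html",  # outside the prefix, ignored
    ],
    f"{ROOT}/page-a.html": [f"{ROOT}/page-b.html", f"{ROOT}/index.html"],
    f"{ROOT}/page-b.html": [],
}

pages = crawl_in_scope(
    base_url=f"{ROOT}/index.html",
    prefix=ROOT,
    max_depth=10000,
    get_links=lambda url: LINKS.get(url, []),
)
print(pages)
```

Under these assumptions, a very large `max_depth` such as 10000 effectively means "follow every link that stays inside the folder", so the prefix does the real scoping.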