Rach

·

Anyone have a positive experience using

Anyone have a positive experience using the Microsoft SharePoint Reader from LlamaHub as a reader/loader? I'm also curious if anyone has been able to stack the SharePoint reader with other, file-type-specific loaders. I need to pull files from SharePoint, but I want the individual file types to load in the best way possible for future parsing.

1 comment


Rach

·

Hi everyone, I'm trying to load

Hi everyone, I'm trying to load different types of source files using different readers. I just got this error for the HTMLTagReader:

Failed to load file NAME with error: HTMLTagReader.load_data() missing 1 required positional argument: 'file'. Skipping...

and now I'm second-guessing my function:

def document_loader(docs_relative_path):
    # Define custom readers
    ##Readers found in https://llamahub.ai/?tab=readers
    class MyHTMLTagReader(HTMLTagReader):
        pass

    class MyJSONReader(JSONReader):
        pass

    class MyPPTReader(PptxReader):
        pass

    class MyXMLReader(XMLReader):
        pass

    #Currently just for .pdf
    ##LlamaCloud account
    parser = LlamaParse(
        api_key="",
        result_type="text",
        verbose=True,
    )

    # Create custom file extractors dictionary
    file_extractors = {
        ".html": MyHTMLTagReader,
        ".json": MyJSONReader,
        ".pdf": parser,
        ".pptx, .ppt": MyPPTReader,
        ".xml": MyXMLReader,
    }

    # Initialize SimpleDirectoryReader with custom file extractors
    ## SimpleDirectoryReader reads any files it finds, treating them all as text. It explicitly supports: .csv, .docx, .epub, .hwp, .ipynb, .jpeg, .jpg, .mbox, .md, .mp3, .mp4, .pdf, .png, .ppt, .pptm, .pptx
    reader = SimpleDirectoryReader(input_dir=docs_relative_path, file_extractor=file_extractors, filename_as_id=False)

    # Load documents
    documents = reader.load_data()
    print("Number of documents loaded:", len(documents))

    # Do further processing with loaded documents
    return documents

Any tips?
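One likely cause, sketched with a stand-in class so it runs without llama_index installed: the dictionary above maps extensions to reader classes (MyHTMLTagReader) rather than instances (MyHTMLTagReader()), while the LlamaParse entry for .pdf is already an instance. When load_data is called on a class, the file path gets bound to self and file is reported missing, which matches the error message. The combined key ".pptx, .ppt" is also probably never matched, since the lookup is by a single extension. StandInReader, try_load, and expand_extensions below are hypothetical names for illustration only, and try_load only approximates how a directory reader would invoke an extractor.

```python
class StandInReader:
    """Hypothetical stand-in for a reader such as HTMLTagReader."""

    def load_data(self, file):
        return [f"document from {file}"]


def try_load(extractor, path):
    """Call load_data roughly as a directory reader would; report TypeErrors."""
    try:
        return extractor.load_data(path)
    except TypeError as exc:
        return f"error: {exc}"


def expand_extensions(mapping):
    """Split keys like '.pptx, .ppt' into one entry per extension."""
    expanded = {}
    for key, reader in mapping.items():
        for ext in key.split(","):
            expanded[ext.strip().lower()] = reader
    return expanded


# Passing the class: "page.html" is bound to `self`, so `file` is missing.
class_result = try_load(StandInReader, "page.html")

# Passing an instance: `self` is the instance and `file` is "page.html".
instance_result = try_load(StandInReader(), "page.html")

# One instance per extension, with the combined key split apart.
file_extractors = expand_extensions({
    ".html": StandInReader(),
    ".pptx, .ppt": StandInReader(),
})

print(class_result)
print(instance_result)
print(sorted(file_extractors))
```

If this is indeed the problem, the change in the original function would be to instantiate each reader in file_extractors (for example MyHTMLTagReader() instead of MyHTMLTagReader) and to give ".pptx" and ".ppt" their own keys. The empty subclasses could then probably be dropped in favor of the base readers.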

2 comments
