
Updated 2 months ago

V0.10

Is the Pandas Excel loader no longer supported? And is there an ongoing update to LlamaHub?
19 comments
With v0.10.x
Llama-hub is considered deprecated.

All the loaders/tools have been moved into the llama-index repo.

Also, are you talking about the pandas CSV reader? I didn't find anything related to an Excel reader.
I found it in this question and was wondering how to load complex Excel files. If you have any insights about which loaders to use, or any recommendations, please don't hesitate.
https://github.com/run-llama/llama_index/issues/9204
Are there also examples of how to use specific loaders, for example a Docx file reader or something similar? I used to get the examples from LlamaHub and I don't know where to find them now.
DocxReader requires a Path object for the file argument. You are passing a string, which is why you are getting the error.
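A minimal sketch of the Path fix, in case it helps. `as_path` and `load_docx` are hypothetical helper names, and the import path `llama_index.readers.file` is an assumption based on the v0.10 package layout (it needs `llama-index-readers-file` installed), so adjust it to whatever your install exposes:

```python
from pathlib import Path


def as_path(file) -> Path:
    """Accept either a str or a Path and always hand back a Path."""
    return Path(file)


def load_docx(file):
    """Hypothetical helper: load one .docx file through DocxReader."""
    # Imported lazily so the helper above works without llama-index installed.
    # Assumed import path for v0.10; check your installed packages.
    from llama_index.readers.file import DocxReader

    return DocxReader().load_data(file=as_path(file))


print(as_path("./data/report.docx").suffix)  # .docx
```

Calling `load_docx("./data/report.docx")` with a plain string should then work, since the string is converted before it reaches the reader.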
For this, Logan has already mentioned the solution. You can try this: https://github.com/run-llama/llama_index/issues/9204#issuecomment-1832046633
Can you tell me how to specify the reader that SimpleDirectoryReader uses by default? I'm trying to change the PDF reader to LlamaParse and have docx files read by Unstructured. The docs don't mention what the default readers are. Thanks.
Plain Text
from llama_parse import LlamaParse
from llama_index.core import SimpleDirectoryReader

parser = LlamaParse(
    api_key="llx-...",  # can also be set in your env as LLAMA_CLOUD_API_KEY
    result_type="markdown",  # "markdown" and "text" are available
    verbose=True
)

file_extractor = {".pdf": parser}
documents = SimpleDirectoryReader("./data", file_extractor=file_extractor).load_data()



You can do the same for docx: add a ".docx" entry to file_extractor.
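To make the override idea concrete, here is a simplified model of how a per-extension mapping can take priority over defaults. This is not the library's actual implementation: `NamedReader`, `DEFAULT_READERS`, and `load_all` are all hypothetical stand-ins that only show the routing.

```python
from pathlib import Path


class NamedReader:
    """Hypothetical stand-in reader; it only records which reader handled a file."""

    def __init__(self, name: str) -> None:
        self.name = name

    def load_data(self, file: Path) -> list:
        return [f"{self.name}:{file.name}"]


# Hypothetical defaults, standing in for the library's built-in readers.
DEFAULT_READERS = {
    ".pdf": NamedReader("default-pdf"),
    ".docx": NamedReader("default-docx"),
}


def load_all(files, file_extractor=None):
    """Route each file to a reader by suffix; user entries override the defaults."""
    readers = {**DEFAULT_READERS, **(file_extractor or {})}
    documents = []
    for file in map(Path, files):
        reader = readers.get(file.suffix.lower())
        if reader is None:
            continue  # this sketch simply skips extensions it does not know
        documents.extend(reader.load_data(file))
    return documents


docs = load_all(
    ["report.pdf", "notes.docx"],
    file_extractor={".docx": NamedReader("unstructured")},
)
print(docs)  # ['default-pdf:report.pdf', 'unstructured:notes.docx']
```

Only the extensions you list are overridden, so in this model the PDF still goes to the default reader while the docx goes to the one you supplied.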
Plain Text
files = Path("./DB_Useful/tests")
file_extractor = {".pdf": parser, ".docx": UnstructuredReader, ".jpg": UnstructuredReader, ".csv": UnstructuredReader}
documents = SimpleDirectoryReader(files, file_extractor=file_extractor, recursive=True).load_data()
I'm using this code, but I get this error because UnstructuredReader takes a file argument:
Failed to load file \tests\csv\file.csv with error: UnstructuredReader.load_data() missing 1 required positional argument: 'file'. Skipping...
Do you have any info about the default reader used by SimpleDirectoryReader? The docs mention that it also supports Docx, and I just want to know whether using the default one is better than using Unstructured.
The default library for docx is docx2txt.
Also, you'll have to pass in a reader instance, like this:
file_extractor = {".pdf": parser,".docx":UnstructuredReader()}
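The difference between `UnstructuredReader` and `UnstructuredReader()` can be reproduced without the library. In the sketch below, `StandInReader` and `call_load_data` are hypothetical, and the assumption is that the directory reader calls `reader.load_data(path)` with the path as a positional argument:

```python
from pathlib import Path


class StandInReader:
    """Hypothetical stand-in for a reader such as UnstructuredReader."""

    def load_data(self, file: Path) -> list:
        return [f"loaded {file.name}"]


def call_load_data(reader, file: Path):
    """Simplified version of what a directory reader does with each entry."""
    return reader.load_data(file)


# Passing the class: the path is bound to `self`, so `file` is reported missing.
try:
    call_load_data(StandInReader, Path("file.csv"))
    class_error = None
except TypeError as exc:
    class_error = str(exc)

# Passing an instance: `self` is already bound, so the path fills `file`.
instance_result = call_load_data(StandInReader(), Path("file.csv"))

print(class_error)
print(instance_result)  # ['loaded file.csv']
```

The first call fails with the same kind of "missing 1 required positional argument: 'file'" message as in the error above, which is why adding the parentheses fixes it.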
Thanks, but it gives me another error this time,
even though I have both unstructured and unstructured[docx] installed.
try restarting the session
Yep that did it, thank you soooo muuchhhh ;))))
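The exact error is not shown in the thread, but one plausible reason a restart helps is that the running session started before the packages were installed. As a rough check, under that assumption, you can ask the current interpreter whether it can see a package at all. `is_importable` is a hypothetical helper name, and it is only meant for top-level package names:

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if the running interpreter can locate the top-level module."""
    return importlib.util.find_spec(module_name) is not None


# Top-level names of the packages discussed above.
for name in ("unstructured", "docx2txt"):
    status = "found" if is_importable(name) else "NOT found in this session"
    print(f"{name}: {status}")
```

If a package shows as not found right after installing it, restarting the session (or checking that pip installed into the same environment) is the usual next step.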