Updated 2 months ago
I think there is a bug in
strike
9 months ago
I think there is a bug in SimpleDirectoryReader, or the documentation needs to be updated. I am unable to parse/index .ppt files, but .pptx files work just fine.
Can anyone help or confirm what needs to be done?
Thanks,
24 comments
Logan M
9 months ago
Is there an error? It's just using the PptxReader.
Attachment
Logan M
9 months ago
It's using from pptx import Presentation to parse; maybe that package is having trouble reading your file.
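For context, python-pptx reads the ZIP-based .pptx format, while legacy .ppt files are binary OLE2 compound documents. The sketch below tells the two containers apart by their leading bytes. The helper names `classify_header` and `sniff_presentation` are made up for illustration and are not part of python-pptx or llama-index. A ZIP signature only shows that the file is a ZIP container, not that it is a valid .pptx.

```python
from pathlib import Path

# Well-known file signatures (magic bytes).
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # legacy binary Office files such as .ppt
ZIP_MAGIC = b"PK\x03\x04"  # ZIP containers, which .pptx files are


def classify_header(header: bytes) -> str:
    """Classify a file by its leading bytes (hypothetical helper)."""
    if header.startswith(OLE2_MAGIC):
        return "legacy-binary"
    if header.startswith(ZIP_MAGIC):
        return "zip-container"
    return "unknown"


def sniff_presentation(path: str) -> str:
    """Read the first 8 bytes of a file and classify them."""
    with Path(path).open("rb") as fh:
        return classify_header(fh.read(8))
```

If `sniff_presentation` reports "legacy-binary" for a file, a .pptx-only parser is unlikely to open it regardless of which reader the extension is mapped to.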
Logan M
9 months ago
I might recommend trying llama-parse if you haven't already
strike
9 months ago
It is using the PptxReader, which parses only .pptx and not .ppt.
strike
9 months ago
Inside pptx module
Attachment
strike
9 months ago
@Logan M llama-parse is limited to PDFs as of now, at least in the web version that I interacted with.
Any suggestions?
strike
9 months ago
PptxReader parses/processes only the .pptx file type, not .ppt.
Logan M
9 months ago
llama-parse was recently updated to handle 50+ file types
Logan M
9 months ago
The .ppt extension is mapped to the PptxReader -- I guess this is an error?
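Since .ppt is routed to a reader that only understands .pptx, one possible workaround is to hold legacy files back before indexing and convert them separately. This is a minimal sketch, assuming LibreOffice is installed and exposes the `soffice` binary. `split_legacy` and `libreoffice_convert_cmd` are hypothetical helper names, not part of llama-index.

```python
from pathlib import Path
from typing import Iterable, List, Tuple

LEGACY_EXTS = {".ppt"}  # assumption: the extensions to hold back from the reader


def split_legacy(paths: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (loadable, legacy) path lists, comparing suffixes case-insensitively."""
    loadable: List[str] = []
    legacy: List[str] = []
    for p in paths:
        if Path(p).suffix.lower() in LEGACY_EXTS:
            legacy.append(p)
        else:
            loadable.append(p)
    return loadable, legacy


def libreoffice_convert_cmd(src: str, outdir: str) -> List[str]:
    """Build (but do not run) a LibreOffice command that converts src to .pptx."""
    return ["soffice", "--headless", "--convert-to", "pptx", "--outdir", outdir, src]
```

The loadable list could then be handed to SimpleDirectoryReader through its input_files argument, and each command run with subprocess.run. Whether the conversion preserves everything you need from the slides is something to verify on your own files.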
strike
9 months ago
yes
strike
9 months ago
Is this update in the API or the web application?
Logan M
9 months ago
I think just the API -- pip install -U llama-parse
Logan M
9 months ago
https://github.com/run-llama/llama_parse/blob/main/examples/other_files/demo_ppt_basic.ipynb
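A rough sketch of how a single parser could be wired in for chosen extensions: SimpleDirectoryReader accepts a file_extractor mapping from file extension to reader, so a small helper can build that mapping. `build_file_extractor` is a hypothetical name, and the commented usage assumes llama-index, llama-parse and an API key are available. Whether the service accepts .ppt depends on the version that is deployed.

```python
from typing import Any, Dict, Iterable


def build_file_extractor(reader: Any, extensions: Iterable[str]) -> Dict[str, Any]:
    """Map each extension (lowercased, with a leading dot) to the same reader."""
    mapping: Dict[str, Any] = {}
    for ext in extensions:
        ext = ext.lower()
        if not ext.startswith("."):
            ext = "." + ext
        mapping[ext] = reader
    return mapping


# Hypothetical usage, not executed here (needs llama-index, llama-parse, an API key):
#   from llama_parse import LlamaParse
#   from llama_index.core import SimpleDirectoryReader
#   parser = LlamaParse(result_type="text")
#   extractor = build_file_extractor(parser, ["ppt"])
#   docs = SimpleDirectoryReader("./slides", file_extractor=extractor).load_data()
```

Overriding only .ppt leaves the default readers in place for every other extension.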
strike
9 months ago
awesome; will check it out - thanks
strike
9 months ago
Hey, I tried the API. Here's the error:
Attachment
Logan M
9 months ago
Aw man, I thought it was more robust than that. Maybe it wasn't fully rolled out...
Logan M
9 months ago
@pld just curious, we have plans to expand that list right?
strike
9 months ago
Is there any branch that has the piece that handles .ppt?
pld
9 months ago
Yes, we should extend it.
pld
9 months ago
https://github.com/run-llama/llama_parse/pull/110
strike
9 months ago
@Logan M Are there any plans to bring this extended support to llama-index readers?
Logan M
9 months ago
No plans currently, but I'd welcome a PR to add any new reader.
strike
9 months ago
And how is llama-parse handling the extended format(s)?
Is there any code I can look into?
Logan M
9 months ago
Currently it's being provided as a SaaS service, with options for on-prem VPC deployments in the future. (So no source code for now.)