Updated 2 months ago
jerryjliu98 9313 What multimodal
anilmatcha
2 years ago
What multimodal capabilities does LlamaIndex have?
7 comments
jerryjliu0
2 years ago
We just released some today! You can ingest image Documents as well as text Documents.
jerryjliu0
2 years ago
We will expand this abstraction once more details of GPT-4's hybrid image/text API are released.
jerryjliu0
2 years ago
https://github.com/jerryjliu/llama_index/blob/main/examples/multimodal/Multimodal.ipynb
anilmatcha
2 years ago
@jerryjliu0 Thanks for sharing. Is this done with OCR behind the scenes?
jerryjliu0
2 years ago
Yep! Currently it is.
Leon_G
2 years ago
@jerryjliu0 What are you using for OCR?
jerryjliu0
2 years ago
It’s either pytesseract or the Donut model.
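For anyone curious what the OCR route looks like in practice, here is a minimal sketch, not the actual LlamaIndex implementation. It assumes pytesseract and Pillow are installed along with the Tesseract binary. `TextDocument` and `image_to_document` are hypothetical names made up for illustration; only `pytesseract.image_to_string` and `PIL.Image.open` are real library calls.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable, Dict, Union

PathLike = Union[str, Path]


@dataclass
class TextDocument:
    """Hypothetical stand-in for a text Document; not a LlamaIndex class."""
    text: str
    metadata: Dict[str, str] = field(default_factory=dict)


def tesseract_ocr(image_path: PathLike) -> str:
    """Extract text from one image file with Tesseract via pytesseract."""
    # Imported lazily so the rest of the sketch works without OCR installed.
    import pytesseract
    from PIL import Image

    with Image.open(image_path) as image:
        return pytesseract.image_to_string(image)


def image_to_document(
    image_path: PathLike,
    ocr: Callable[[PathLike], str] = tesseract_ocr,
) -> TextDocument:
    """Turn an image into a text document by running OCR on it.

    The `ocr` argument is pluggable (an assumption of this sketch), so a
    different backend such as a Donut-based function could be swapped in.
    """
    text = ocr(image_path).strip()
    return TextDocument(
        text=text,
        metadata={"source": str(image_path), "kind": "image"},
    )
```

Calling `image_to_document("scan.png")` would run Tesseract on that file (the file name is just an example), and the resulting text can then be indexed like any other text document.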