
Updated 2 years ago

Image captioning

At a glance
The community member asked if llama-index has any multimodal implementations that can do image to text. Another community member responded that the simple directory reader can do some of that functionality, and provided a link to the image_caption_reader.py code. The original community member thanked the other and said this was exactly what they were looking for.
Does llama-index have any multimodal implementations that can do image to text?
2 comments
Yeah, SimpleDirectoryReader will do some of that under the hood for you.

Or you can use the image loader directly

The actual loader code is here

https://github.com/jerryjliu/llama_index/blob/main/llama_index/readers/file/image_caption_reader.py
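For anyone landing here later, a rough sketch of both routes mentioned above. The import path is assumed from the file linked above and the class name `ImageCaptionReader` is assumed to match that module; both may differ in newer llama-index releases. `find_images`, `caption_images`, and `load_with_directory_reader` are hypothetical helper names, not part of the library. The captioning model also needs extra packages (torch, transformers, Pillow) and is downloaded on first use, so the library imports are kept inside the functions.

```python
from pathlib import Path

# Extensions treated as images in this sketch (an assumption, adjust as needed).
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def find_images(folder):
    """Hypothetical helper: return image files in `folder`, sorted by path."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


def caption_images(folder):
    """Use the image caption loader directly, one file at a time."""
    # Imported lazily: assumed module path, taken from the linked file.
    from llama_index.readers.file.image_caption_reader import ImageCaptionReader

    reader = ImageCaptionReader()  # loads the captioning model
    captions = {}
    for path in find_images(folder):
        docs = reader.load_data(file=path)  # returns a list of Document objects
        captions[path.name] = docs[0].text
    return captions


def load_with_directory_reader(folder):
    """Let SimpleDirectoryReader route image files to the caption loader."""
    # Imported lazily: import paths assumed for the llama-index version of this thread.
    from llama_index import SimpleDirectoryReader
    from llama_index.readers.file.image_caption_reader import ImageCaptionReader

    reader = ImageCaptionReader()
    extractor = {ext: reader for ext in IMAGE_EXTENSIONS}
    return SimpleDirectoryReader(
        input_dir=str(folder), file_extractor=extractor
    ).load_data()
```

Calling `caption_images("./photos")` should give a dict of file name to caption text, and the documents returned by `load_with_directory_reader` can be passed to an index like any other loaded documents.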
Thanks! Exactly what I was looking for. Been using the image output thing but that's the exact model I wanted to use!