Updated 2 years ago

Image captioning

At a glance

A community member asked whether llama-index has any multimodal implementations that can do image-to-text. Another community member replied that the simple directory reader handles some of this under the hood, that the image loader can also be used directly, and linked to the image_caption_reader.py source. The original poster confirmed this was exactly what they were looking for, since the loader uses the model they wanted.

Teemu

Does llama-index have any multimodal implementations that can do image to text?

2 comments

Logan M

Yea, simple directory reader will do some of that under the hood for you.

Or you can use the image loader directly

The actual loader code is here

https://github.com/jerryjliu/llama_index/blob/main/llama_index/readers/file/image_caption_reader.py
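For anyone who wants to try the loader directly, here is a rough sketch. It assumes the import path mirrors the linked file, that the class in that file is called ImageCaptionReader, and that load_data(file=...) returns documents whose .text holds the caption; check these against your installed version, since newer llama-index releases may have moved the class. find_images and caption_images are hypothetical helper names made up for this example, not part of llama-index.

```python
from pathlib import Path

# Extensions treated as images here; this set is an illustrative choice.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def find_images(directory):
    """Return image files directly inside `directory`, sorted by path."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def caption_images(paths):
    """Map each image file name to the caption text from the loader."""
    # Assumption: the import path mirrors the file linked above.
    # Imported lazily so find_images works without llama-index installed.
    from llama_index.readers.file.image_caption_reader import ImageCaptionReader

    # Assumption: constructing the reader loads the captioning model,
    # which likely needs torch, transformers, and Pillow installed.
    reader = ImageCaptionReader()

    captions = {}
    for path in paths:
        # Assumption: load_data(file=...) returns a list of documents
        # whose .text attribute holds the generated caption.
        docs = reader.load_data(file=path)
        captions[path.name] = " ".join(doc.text for doc in docs)
    return captions
```

Usage would look something like caption_images(find_images("photos")), where "photos" is a placeholder directory name. The first call may be slow if the model has to be downloaded.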

Teemu

Thanks! Exactly what I was looking for. Been using the image output thing but that's the exact model I wanted to use!