Find answers from the community

Updated 4 weeks ago

Image

does LI have a solution to managing the size of a page to send the oai image model?
to manage resolution vs. tokens?
L
t
7 comments
I think for resolution, you can just set low/high/auto for image details
https://github.com/run-llama/llama_index/blob/af9abd06a456a3745d02379f8afc4b6cab3a3f72/llama-index-integrations/multi_modal_llms/llama-index-multi-modal-llms-openai/llama_index/multi_modal_llms/openai/base.py#L60

I havent checked openais exact api to see if they have more controls than that recently
Will take a look. Glad to see you're in the room.

Thank you.
my pdf's are all scanned physial documents. so the pages are basically like photos of document pages.
they are a challenge to work with.
Typically the best approach we've seen is using llama parse (or something else) to ocr the page, and sending both the text and image to the llm

We have examples doing that 😁
Add a reply
Sign up and join the conversation on Discord