Updated 3 weeks ago

Vision

What is currently the best vision model that performs as well as GPT-4o? That is, an LLM with a vision modality.
7 comments
Like open source? Probably Llama 3.1/3.2 or whatever the multimodal one is. Although it will be nowhere close to OpenAI lol

For closed source, Sonnet and Gemini are very good.
Hi!

Either, as long as it can run locally.
What about Qwen?
Haven't heard much about Qwen's multimodal capability tbh

I wouldn't trust academic benchmarks too much. I feel like most models these days are overfitting to the benchmarks out there.
Oh, really? That's interesting.
With OAI vision, on an 8,000-character document (with lots of numbers), I am getting 1 character wrong.
I'd like a similar model, but running locally.
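If you want to compare a local model against that baseline on your own documents instead of relying on public benchmarks, one rough way is to score each transcription by character error rate (edit distance divided by reference length). For reference, 1 wrong character in 8,000 works out to 1/8000 = 0.0125%. The sketch below is only an illustration: the function names `levenshtein` and `character_error_rate` are made up here, and it assumes you already have a hand-checked ground-truth transcription to compare against.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    # Keep the shorter string as the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # drop a character from a
                current[j - 1] + 1,      # add a character from b
                previous[j - 1] + cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance normalized by the length of the ground-truth text."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical example: the model misreads one digit in a short line.
    truth = "Total: 1,204.50 USD"
    model_output = "Total: 1,204.60 USD"
    print(f"CER: {character_error_rate(truth, model_output):.4%}")
```

This pure-Python version is quadratic in the text length, so on full 8,000-character documents it will likely be slow; a compiled edit-distance library would be the practical choice there.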