Vision
tarpus
3 weeks ago
What is currently the best vision model that performs as well as GPT-4o? That is, an LLM with a vision modality.
Logan M
3 weeks ago
Like open source? Probably Llama 3.1/3.2, or whichever one is the multimodal one. Although it will be nowhere close to OpenAI lol
For closed source, Sonnet and Gemini are very good
tarpus
3 weeks ago
Hi!
Either, as long as it can run locally.
tarpus
3 weeks ago
What about Qwen?
Logan M
3 weeks ago
Haven't heard much about Qwen's multimodal capability tbh
I wouldn't trust academic benchmarks too much. I feel like most models these days are overfitting to the benchmarks out there
tarpus
3 weeks ago
Oh, really? That's interesting.
tarpus
3 weeks ago
With OpenAI vision on an 8,000-character document (with lots of numbers), I am getting only 1 character wrong
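For comparing a local model against that figure on your own documents rather than on public benchmarks, one option is to measure the character error rate (CER) yourself. One wrong character out of 8,000 works out to 1/8000 = 0.0125%, assuming the mistake is a single substituted character. The snippet below is a minimal sketch, not anything posted in this thread: the function names and the sample strings are made up for illustration, and it assumes you have a hand-checked reference transcription to compare against.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions separating the two strings."""
    # previous[j] holds the distance between the reference prefix
    # processed so far and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # reference character missing from hypothesis
                current[j - 1] + 1,      # extra character in hypothesis
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the length of the reference text."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical sample: one digit misread in a short numeric line.
    truth = "Invoice 10482: total 1,937.50"
    model_output = "Invoice 10482: total 1,987.50"
    print(f"CER: {character_error_rate(truth, model_output):.4%}")
```

This pure-Python version is quadratic in text length, so for full 8,000-character documents it will be slow; it is meant to show the metric, not to be a fast implementation.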
tarpus
3 weeks ago
I'd like a similar model, but running locally