Community members are discussing their experiences switching from GPT-3.5 to GPT-4o mini. They have found that structured output often comes back in the wrong format, and responses are slower; however, GPT-4o mini's pricing is much better. Some community members have observed similar issues and are disappointed with the step back in performance. They have explored options like JSON mode and function calling, but these do not fully resolve the problems. One community member has been switching to Anthropic, which they find to be more reliable, though a bit slower and requiring new prompts.
Does anyone have any findings around switching from gpt 3.5 to gpt 4o mini? I'm finding that structured content is quite a bit worse for gpt 4o mini vs 3.5, often failing to return the values in the correct format. Also the speed is quite a bit slower... But I feel we have to switch as the pricing is so much better...
Forcing JSON (either with JSON mode or function calling) won't fully solve it either. The model can still write bad tool inputs, or hallucinate tool names.
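For reference, a minimal sketch of what "forcing JSON" means here with the OpenAI Python SDK. The model name, schema, and prompts are placeholders; JSON mode only guarantees syntactically valid JSON, and a forced tool call only guarantees that the named tool is called, not that the arguments are sensible.

```python
# Sketch only: two ways of forcing JSON output with the OpenAI Python SDK (v1.x).
from openai import OpenAI

client = OpenAI()

# Option 1: JSON mode. The output is valid JSON, but the shape isn't enforced.
resp = client.chat.completions.create(
    model="gpt-4o-mini",
    response_format={"type": "json_object"},
    messages=[
        {"role": "system", "content": 'Reply with JSON: {"language": str, "summary": str}'},
        {"role": "user", "content": "Summarise this article in French: ..."},
    ],
)

# Option 2: function calling with tool_choice set. The model must call the
# tool, but it can still produce malformed or hallucinated argument values.
resp = client.chat.completions.create(
    model="gpt-4o-mini",
    tools=[{
        "type": "function",
        "function": {
            "name": "record_summary",
            "parameters": {
                "type": "object",
                "properties": {
                    "language": {"type": "string"},
                    "summary": {"type": "string"},
                },
                "required": ["language", "summary"],
            },
        },
    }],
    tool_choice={"type": "function", "function": {"name": "record_summary"}},
    messages=[{"role": "user", "content": "Summarise this article in French: ..."}],
)
```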
There is a .parse endpoint (which we haven't integrated yet) that only works on a select few models, but my understanding is it's the same as function calling with tool_choice set (which we already do).
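For anyone unfamiliar, this is roughly what the .parse endpoint looks like when used directly from the OpenAI Python SDK (structured outputs). It's a sketch only: the Pydantic schema is illustrative, and it requires a recent SDK and one of the supported models.

```python
# Sketch: OpenAI structured outputs via the beta .parse endpoint.
from openai import OpenAI
from pydantic import BaseModel

class Summary(BaseModel):
    language: str
    summary: str

client = OpenAI()
completion = client.beta.chat.completions.parse(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarise this article in French: ..."}],
    response_format=Summary,
)
parsed = completion.choices[0].message.parsed  # a Summary instance, or None if the model refused
```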
hmm okay. Quite disappointed with OAI on this one. We're also just seeing it follow instructions worse in general. For example, we prompt it to answer in a certain language and it sometimes ignores this and uses the language of the content we provide instead. Very annoying...
I've been slowly switching to Anthropic for a lot of new apps lately. It's a bit slower (probably similar to 4o-mini) and requires new prompts (you NEED to prompt/parse with XML imo), but it seems very reliable so far.
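A rough sketch of the XML prompt/parse pattern mentioned above, using the anthropic Python SDK. The model name, tag name, and prompts are placeholders, not a prescribed setup.

```python
# Sketch: ask Claude to wrap its answer in XML tags, then pull the answer out.
import re
import anthropic

client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=1024,
    system="Answer in French. Wrap your answer in <summary>...</summary> tags.",
    messages=[{"role": "user", "content": "Summarise this article: ..."}],
)

text = message.content[0].text
match = re.search(r"<summary>(.*?)</summary>", text, re.DOTALL)
summary = match.group(1).strip() if match else text
```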