Bad news. I gave o3 two full days of being my go-to coding LM, and ...
Today, I spent an entire day using o3 and o4-mini-high to translate my book from English to French, and the result is awful.
Just like o1 and o3-mini-high in the past, the updated ChatGPT models don't follow instructions: they completely ignore the given glossary and disregard the original LaTeX and Markdown annotations in the input text.
A deplorable picture for OpenAI.
People saying good things about o3 and o4-mini likely fall for more elaborate output formatting with tables and emojis. That's not why I use LMs.

Out of curiosity - was Grok able to translate your book?
As a PhD in AI, you know different LLMs are meant for different things.
The new OpenAI 4.1 is good at coding, not sure why you would use those other models. Besides that, Sonnet is the best at code.

I've tested o3 deep research on cybersecurity topics. It seems to do well where there is existing reference content it can analyze. However, any expectation of drawing conclusions from reasoned analysis not already available in the source material is nowhere close to “intelligent.”
Is there a difference between performance in Quebec French vs Metropolitan?
I just translated the book Limitless from English to German with Gemma 3 12B and it is flawless lol. But I had to split the text into chunks of 4k-6k tokens each, as it cannot output that much at once locally with q4.
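A rough sketch of that kind of chunking, not the commenter's actual script: it greedily packs paragraphs into chunks under a token budget. The function name `split_into_chunks` and the whitespace word count used as a stand-in for a real tokenizer are assumptions for illustration; a real setup would plug in the model's own tokenizer via `count_tokens`.

```python
def split_into_chunks(text, max_tokens=4000, count_tokens=None):
    """Greedily pack paragraphs into chunks of at most max_tokens each.

    A single paragraph larger than max_tokens is kept whole as its own
    oversized chunk rather than being cut mid-paragraph.
    """
    if count_tokens is None:
        # Assumption: whitespace-separated words approximate tokens.
        count_tokens = lambda s: len(s.split())

    chunks, current, current_size = [], [], 0
    for para in text.split("\n\n"):
        size = count_tokens(para)
        # Flush the current chunk if adding this paragraph would exceed the budget.
        if current and current_size + size > max_tokens:
            chunks.append("\n\n".join(current))
            current, current_size = [], 0
        current.append(para)
        current_size += size
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

Splitting on paragraph boundaries keeps LaTeX and Markdown blocks intact more often than cutting at a fixed character offset, and each chunk can then be translated separately and rejoined with blank lines.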
Yes, that has been my gripe too. Remarkably, Claude 3.7 is maybe not just as bad, but not much better. Still, I feel that codegen-wise Anthropic is a bit ahead of Grok, Gemini, or OpenAI.

It’s interesting, but it seems to me that a large percentage of the general public’s qualitative analysis of LLMs is based primarily not on the accuracy of their output or the usefulness of their tools, but on the perceived humanity of their language style.
Gemini Pro is really good. Also look at a specialized model like Qwen-2.5-coder.