As a happy OpenRouter user, I know the vast majority of the industry uses vendor APIs directly, and that the OpenRouter rankings tell you nothing about those models' real usage.
I use Codex on high because Anthropic's CC Max plan started fucking over people who want to use Opus. Sonnet kind of stinks on the more complex problems Opus can crush, but they want to force Sonnet usage, maybe to save costs.
GPT-5 Codex on high does a great job with the advanced use cases I throw at it, and the usage limits are generous.
Well, that may also be because ChatGPT is worse than Gemini and Claude for coding. I don't know what the benchmarks say; I'm just speaking from my own experience.
> ChatGPT is overwhelmingly, unambiguously, a "regular people" product.
How many of these people are paying, though, and how much? Most "regular" people I've met who have switched to ChatGPT use it as an alternative to a search engine and don't pay for it (only one person I know pays, and he uses Sora to generate images for his business).
I really struggle to see a path where $0.01 ad inventory covers the cost of inference, much less training or any of OpenAI's other ventures. Unless every query makes you watch a 30-second unskippable video, or something equally awful.
Users will ask ChatGPT for recommendations and the answer will feature products and services that have paid to be there, probably with some sort of attribution mechanism so OpenAI can get paid extra if the user ends up completing the purchase.
> I mean, yes, but also because it's not as good as Claude today.
I'm not sure. Sometimes GPT-5 Codex (or even regular GPT-5 with Medium/High reasoning) can do things Sonnet 4.5 would mess up, and vice versa. Most recently it figured out why some wrappers around PrimeVue DataTable components wouldn't let the paginator show up and work correctly, alongside other such debugging; a sketch of that kind of bug is below. Sometimes Gemini 2.5 Pro is also pretty okay, especially for multilingual stuff. There's a lot of randomness/inconsistency/nuance there, but most of the SOTA models are generally quite capable. I kinda thought GPT-5 wasn't very good a while ago, but then I used it a bunch more and my view of it improved.
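For context, a common way that kind of paginator bug shows up in Vue 3 wrappers is attribute fallthrough landing on the wrong element. This is only a minimal, hypothetical sketch (the wrapper name and props are made up, not the actual code I was debugging):

```vue
<!-- DataTableWrapper.vue: hypothetical wrapper around PrimeVue's DataTable -->
<script setup lang="ts">
import DataTable from 'primevue/datatable';

// The wrapper only declares `items`; everything else the parent passes
// (paginator, rows, ...) arrives as fallthrough attributes.
defineProps<{ items: unknown[] }>();
</script>

<template>
  <!-- The root is a plain <div>, so fallthrough attrs like `paginator` and
       `rows` stick to the div instead of reaching DataTable, and no paginator
       ever renders. One fix: defineOptions({ inheritAttrs: false }) in the
       script plus v-bind="$attrs" on <DataTable> so those props get through. -->
  <div class="table-card">
    <DataTable :value="items">
      <slot />
    </DataTable>
  </div>
</template>
```

Easy to spot once you know to look for it, surprisingly annoying when it's buried under a few layers of wrappers.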
Codex is great for fixing memory leaks systematically. Claude will just read the code, say "oh, it's right here," then change something and claim it's fixed. It isn't fixed, and it won't undo its useless change when you point that out.
Afraid not, it's a bit outside of my budget (given that I've been pushing millions of tokens daily, especially for lots of refactoring that would be great to do in an automated fashion but for which codegen solutions... just don't exist). From what little I've used Opus in the past, I'm sure it would do reasonably well too. Maybe even Sonnet, with more attempts, different prompts, etc.
Their tokens; they released a report a few months ago.
However, I can only imagine that OpenAI produces the most intentionally requested tokens of all the labs (i.e., cases where the user deliberately went to the app/website).
ChatGPT is overwhelmingly, unambiguously, a "regular people" product.