tacone · 2026-06-23T15:46:19 1782229579

My take is that Anthropic and OpenAI simply are NOT competing on price. Two big players are often not enough to create tension on price.

Chinese models and open model providers are, indeed, competing on price, and the difference shows.

gizmo686 · 2026-06-23T16:31:35 1782232295

One player is enough to create tension on price when "don't buy it at all" is a competitive option. By most accounts, Anthropic and OpenAI both lose to "just don't buy" when they try charging at cost.

bluGill · 2026-06-24T00:09:43 1782259783

Nobody who is talking has seen the financials. We have various rumors of costs but no real reason to believe any of them.

rhinoceraptor · 2026-06-23T16:05:52 1782230752

How are Anthropic and OpenAI going to compete on price when they're both already deeply unprofitable?

solidasparagus · 2026-06-23T16:36:48 1782232608

Serving the API is profitable. They are unprofitable because of R&D (and maybe subscription costs?). If they can continue to find access to R&D capital, there is space to reduce API costs.

dns_snek · 2026-06-23T17:16:14 1782234974

Nuclear energy is really cheap too... as long as you ignore CapEx, would you like to invest?

HDThoreaun · 2026-06-23T20:02:18 1782244938

Marginal cost of nuclear is huge. Marginal cost of inference is much smaller. Capex in nuclear isn’t a fixed cost, it is the marginal cost.

dns_snek · 2026-06-24T12:15:33 1782303333

The marginal cost of nuclear energy is 14-20% of the total cost according to pages 39-40 of [1].

The point I'm making is that claiming that AI labs would be profitable if only they could stop spending money on the only thing that makes them valuable is absurd. Frontier models are like a nuclear power plant that needs to be rebuilt from scratch every 24 months.

Let's say that they paused R&D a year ago. It's June 2026, OpenAI's latest offering is GPT 4.1, Codex is still just a private beta that hasn't been updated in months. How much revenue do you think they would be making right now? My guess is approximately zero.

[1] https://www.lazard.com/media/5tlbhyla/lazards-lcoeplus-june-...

HDThoreaun · 2026-06-24T16:34:10 1782318850

The labs don’t have to stop R&D to become profitable. They just need more customers. This strategy doesn’t work in nuclear. Building a nuclear plant doesn’t mean you can scale it up to serve the whole world. Building an AI model does.
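
To make the fixed-versus-marginal argument concrete, here is a minimal sketch. Every number in it is made up for illustration and is not a claim about any lab's actual financials; the function name and the three constants are my own assumptions.

```python
import math


def break_even_customers(fixed_cost: float,
                         revenue_per_customer: float,
                         marginal_cost_per_customer: float) -> int:
    """Smallest customer count at which total margin covers the fixed cost."""
    margin = revenue_per_customer - marginal_cost_per_customer
    if margin <= 0:
        # Serving each customer at a loss can never pay back a fixed cost.
        raise ValueError("no break-even: each customer is served at a loss")
    return math.ceil(fixed_cost / margin)


# Hypothetical figures, for illustration only.
RND_PER_YEAR = 10_000_000_000       # fixed: spent regardless of customer count
REVENUE_PER_CUSTOMER = 240          # per customer per year
INFERENCE_COST_PER_CUSTOMER = 80    # marginal: grows with every customer

n = break_even_customers(RND_PER_YEAR,
                         REVENUE_PER_CUSTOMER,
                         INFERENCE_COST_PER_CUSTOMER)
print(n)  # 62500000
```

With these placeholder figures the break-even sits at 62.5 million customers. Past that point each extra customer adds margin without adding R&D cost, which is the scaling property the comment is pointing at. If the per-customer margin is zero or negative, no customer count gets you there.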

dominotw · 2026-06-23T17:02:42 1782234162

How do you have access to their financials? Are you an insider?

Edit: to the commenter below: it was widely reported last year that these companies were unprofitable [1]. I am asking this question of this specific comment because it made a very specific claim about which part of the business is profitable, something only an insider would know.

[1] https://www.wsj.com/tech/ai/openai-anthropic-profitability-e...

mh- · 2026-06-23T17:12:22 1782234742

I'm curious why you didn't pose this question to the grandparent commenter, who first asserted the opposite?

zyuiop · 2026-06-23T18:45:41 1782240341

The amount of capital they need to raise, despite the claimed revenue, indicates that they spend more than they gain, which is by definition unprofitable.

brainwad · 2026-06-23T17:03:31 1782234211

Anthropic just announced it's on track to have its first profitable quarter: https://www.wsj.com/tech/ai/mind-blowing-growth-is-about-to-...

dns_snek · 2026-06-23T17:19:05 1782235145

Response: https://www.wheresyoured.at/anthropics-profitability-swindle...

SpicyLemonZest · 2026-06-23T16:20:20 1782231620

They may not be able to! It's pretty widely acknowledged, for example, that if there's some surprising plateau hiding around the corner they're both going to fail. But that could mean that they're overcharging for AI usage to get research money and sustainable rates are lower rather than higher.

guax · 2026-06-23T17:23:50 1782235430

I think that for coding we're past the plateau issue. The frontier models of today are good enough and very valuable. The expensiveness in running them will eventually be solved by cheaper faster hardware.

I do hope that a day will come when you can buy the Nvidia Spark thingy for 5k that can run the equivalent of Opus 4.6 or 4.5 locally, and that would be a massive thing.

johnvanommen · 2026-06-23T17:34:56 1782236096

> The expensiveness in running them will eventually be solved by cheaper faster hardware.

How?

* Moore's Law is almost over. The 5090 improves over the 4090 mostly because of quant improvements.

* Even if the hardware improves, there’s a huge incentive to slow roll the next generation. Nobody wants to end up like Sun Microsystems. Sun’s used hardware was faster than its new hardware, once you considered price. Sun ended up competing with its own used equipment.

The most obvious place for improvement is RAM, network and storage.

If someone can bring more RAM onto the market, that will unstick things.

Legend2440 · 2026-06-23T17:54:53 1782237293

GPUs are not really the ideal architecture for running neural networks; they are heavily bottlenecked by memory bandwidth and struggle to keep all their tensor cores supplied with data.

There is significant room to make more specialized neural network accelerators with new compute-in-memory architectures.

If the brain can run 86 billion neurons on 30W it must be possible.

akomtu · 2026-06-23T21:57:37 1782251857

Our brains run 86 billion neurons the same way a waterfall runs a fluid simulation with N quadrillion particles.

VorpalWay · 2026-06-23T22:56:02 1782255362

There are already some companies doing specialised inference hardware, Cerebras Systems for example. Such designs are still early days and I wouldn't be surprised to see more innovation there. Though because custom silicon design takes time I expect a multi-year cycle.

For training, not sure. But even if training runs on GPUs, once you have the model the main cost is inference.

CuriouslyC · 2026-06-23T16:28:02 1782232082

The whole hidden plateau hypothesis is kinda bunk, because we're already pretty far in a plateau for general knowledge/question answering, but there are many subdomains where we can push model capabilities, and as we saturate one subdomain we can just shift to another economically valuable one.

There isn't one AI intelligence S curve, there are thousands of them, and they're mostly invisible in the major benchmarks, but for someone trying to do work in that specific area of capability, the progress is transformative.

SpicyLemonZest · 2026-06-23T16:51:23 1782233483

I'm skeptical of a hidden plateau, but I really think it's overconfident to assume there's not one. Remember that it doesn't even have to be a technical plateau; the effective plateau of e.g. car speeds is determined by regulations and road conditions, and far below what "frontier cars" are capable of on a controlled racetrack.

wonnage · 2026-06-23T17:44:44 1782236684

That’s the scenario where we’ll all be using Chinese models

intrasight · 2026-06-23T16:29:28 1782232168

There is no moat until a company achieves RSI and/or AGI, and the one that does succeed in moat-making will do so by hacking into and destroying their competitor's infrastructure.

Once moat is achieved, you don't have to compete on price. Of course it'll be academic because the AI will probably destroy all of us.

lenkite · 2026-06-23T16:32:06 1782232326

Chinese models are dropping in price thanks to ridiculous levels of state subsidy, where companies are forced into aggressive price wars to survive and grab market share. I am guessing this will also blow up sometime next year, or in 2029 at the latest.

Btw, some Chinese companies have already seen this and raised their prices, Zhipu AI & Tencent for example. Alibaba, Baidu, and Tencent also announced multiple price increases for their AI services.

LPisGood · 2026-06-23T16:40:08 1782232808

This is in contrast to American models which receive _ridiculous_ levels of private subsidy.

SwellJoe · 2026-06-23T16:59:11 1782233951

China has the benefit of vast solar power and rapidly increasing battery capacity. Yes, that's subsidized, but it pays for itself in the long run.

And, even with the price increases, Z.ai and Tencent are still much cheaper than Anthropic or OpenAI models. I think there's an efficiency focus among the Chinese models that is absent at OpenAI and Anthropic, and in the end I suspect efficiency will be the winning feature. Google seems to understand that. Gemini 3.5 Flash is pretty competitive with the big guys, and it's small enough for Google to run it profitably (I assume) for a price that's much less than the frontier models. Gemma 4 models are showing off a bunch of efficiency techniques (MTP, QAT, the 12B encoder-less vision model that soundly outperforms much larger vision models, DiffusionGemma), and I assume they have several more techniques that aren't published.

wqaatwt · 2026-06-23T16:57:56 1782233876

Chinese companies like DeepSeek are operating on shoestring budgets (allegedly fewer than 300 employees at Chinese wages). It’s not self-evident that anything needs to be subsidized besides compute (due to limited manufacturing capacity and limited access to Western chips in China).

tacone · 2026-06-14T22:44:04 1781477044

Looks great, especially on mobile, congrats!

tacone · 2026-06-13T09:01:51 1781341311

That's actually a good point. I had Fable available almost immediately under my Copilot subscription and never bothered to use it even to say hello.

But from what I hear, Fable looks like an incremental update, with improved behavior imprinted by training.

Something that you could theoretically approximate by using a good set of instructions and model orchestration (tweaking the session life cycle, using a second model to understand user intentions, using a third model to prevent drift, ...).

If the above is true, the only discriminator would be user effort.

If Fable is dangerous, then we are still in danger right now, and have been for the last few months at the very least.

tacone · 2026-06-12T06:55:56 1781247356

I'm starting to think that what Anthropic really fears is not vulnerability discovery but rather Fable going around the internet making trouble.

eijew · 2026-06-12T11:13:54 1781262834

Nailed it. That’s exactly it.

tacone · 2026-06-11T18:09:31 1781201371

That also means people are paying money to execute a prompt they've (partially) written.

tacone · 2026-06-04T06:06:29 1780553189

What a nightmare, happy you've made it through.

tacone · 2026-06-02T09:35:56 1780392956

Cancelled my personal subscription (annual) yesterday.

- they "swallowed" my monthly subscription in January, so I had to subscribe (and pay) again

- they promised tools to preview the new costs, but did too little and did it badly (you have to click an export button, wait for an email, click a link in it, then download their CSV, which even showed substantial dollar costs for rows with 0 requests)

- models kept appearing/disappearing/re-appearing-disabled on our company account in recent weeks

- as of May 31st, I had no clue and could not tell whether I would be migrated to token billing or would have to stay with the moronic new multipliers. News came on Jun 1st, of course

They don't really look able or willing to properly manage their own product at the moment. And yes, new subscriptions are paused, so I won't be able to re-subscribe.

Quite frankly, the only reason to go with Copilot is to have it in the VSCode chat (and yes, there's some chance to use it BYOK, provided it works).

Besides, their offering, even at market prices, looks inferior to what you can get elsewhere. You can use DeepSeek and pay pennies, use Fireworks and have the choice of cheaper open models (which GitHub does not provide, and which are actually good and sometimes even better than Claude), or subscribe to OpenRouter and use virtually anything.

I still have no idea if cancelling my subscription will get any money back, probably not.

tacone · 2026-05-31T18:04:58 1780250698

The first time I took creatine (6g IIRC), I actually felt the mental effect just 2 minutes later. A pleasurable sensation of augmented presence and (mental) relaxation.

I've paused taking it temporarily as a precaution while I wait for a physical issue to normalize, but I plan to resume in a few weeks. Also, it is considered a very safe supplement.

merlindru · 2026-05-31T18:29:17 1780252157

How can you be sure you weren't placebo'ing yourself into feeling it?

To my knowledge, creatine has no significant effects until your levels rise after, say, a week of taking it daily.

snovv_crash · 2026-05-31T18:57:05 1780253825

I've felt dehydrated while I'm in loading phase, it's very noticeable from that perspective. But the mental effects only come after at least a week (for me: longer attention span, less impacted by poor sleep).

tacone · 2026-05-31T20:06:48 1780258008

Well.. I can't, nobody can. Perhaps it was a placebo effect or perhaps I had low levels of creatine and felt the difference. I often felt better right after taking it though, not consistently but often.

lII1lIlI11ll · 2026-05-31T22:18:38 1780265918

Did you inject or smoke it in order to feel "the mental effect just 2 minutes later"? Taken orally it can only be placebo effect.

tacone · 2026-05-30T09:26:10 1780133170

Just spent an hour trying to make it work (including re-compiling) with the jsruntime.

`Error: JavaScript modules found but libperry_jsruntime.a not found. Build it with: cargo build --release -p perry-jsruntime`

Turns out jsruntime was removed one week ago, but the error messages probably have not been updated as they should have been.

https://github.com/PerryTS/perry/commit/848339fa4ee4b00a53f5...

tacone · 2026-05-29T20:54:42 1780088082

I don't really understand why the comment has been downvoted.

We actually need more of this, perhaps not in this exact shape, but similar.

It would be extremely cool to be able to write one or two lines of prompt in my harness, and have a light model iterate with me a few times writing/proposing requirements, guidelines and explanations, refining the prompt until it's ready to be sent to the actual LLM.

Lack of specifications in the prompt is (imho?) one of the main drivers that lead LLMs astray, and it often happens because it's not realistic to always type out, or even think through, every angle before submitting each prompt.

Think of it as the missing link between a single-shot prompt and a skill.

Ideally, it should be integrated in the chat, for quick access.

This project is probably different in aim, but I still find it interesting.
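
A minimal sketch of the loop I have in mind, not tied to any real harness. `propose` and `ask_user` are hypothetical callables standing in for the light model and the chat UI; nothing here is an existing API, and the stubs at the bottom only exist to show the flow.

```python
from typing import Callable


def refine_prompt(draft: str,
                  propose: Callable[[str], str],
                  ask_user: Callable[[str], str],
                  max_rounds: int = 3) -> str:
    """Expand a short draft prompt through a few model/user rounds.

    propose:  hypothetical wrapper around a cheap model; returns an
              expanded draft with requirements and guidelines added.
    ask_user: shows the proposal to the user; returns "" to accept it,
              or free-text corrections to fold into the next round.
    """
    prompt = draft
    for _ in range(max_rounds):
        proposal = propose(prompt)
        feedback = ask_user(proposal).strip()
        if not feedback:
            return proposal  # accepted: ready to send to the main LLM
        # Carry the corrections into the next round's input.
        prompt = f"{proposal}\n\nUser corrections:\n{feedback}"
    return prompt  # round budget exhausted: return the latest state


# Stubs for illustration: one correction, then acceptance.
answers = iter(["also cover error handling", ""])
final = refine_prompt(
    "add retry to the fetch helper",
    propose=lambda p: p + "\n- Requirement: keep the public API unchanged",
    ask_user=lambda proposal: next(answers),
)
print(final)
```

The round cap matters: without it a picky user (or a chatty light model) turns a quick refinement into a second conversation, which defeats the point of doing this before the real prompt is sent.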

CharlesW · 2026-05-30T03:33:25 1780112005

> It would be extremely cool to be able to write one or two lines of prompt in my harness, and have a light model iterate with me a few times writing/proposing requirements, guidelines and explanations, refining the prompt until it's ready to be sent to the actual LLM.

It is cool and (IMO) necessary, and most AI-using coders I know do this using skill suites like Superpowers (see: /superpowers:brainstorming). https://github.com/obra/superpowers

velapod · 2026-05-31T22:15:04 1780265704

yea i can imagine it could be possibly moved to skills