Hacker News | rgbrenner's comments

The article has no date on it, but says deferred tool loading is a recent update that occurred after the article was written. Deferred tool loading was added in Nov 2025: https://www.anthropic.com/engineering/advanced-tool-use

So these numbers are at least 7 months out of date. Why is this being posted now?


+1

It's crazy that people are still discussing this. It's ancient history. Deferred tool loading, large contexts, and prompt caching have made 2026 completely different from 2025.

Also, the "CLI saves tokens" debate really falls apart when step one of using the CLI is running "--help". The problem remains: if knowing how to call the thing isn't in parametric memory, it has to be in context.


Build a more specific skill for the exact workflow you want?

Skill still needs to be loaded in context, what would it change?

I think what they mean is that instead of ten perfectly orthogonal "unix philosophy" tools (skills) for the agent to compose when solving a problem, each with an API surface (description text) the size of Texas, you'd want to can each composition in a shell script (or a bespoke Rust binary, if you enjoy watching your bot perform some heavy lifting) that solves only one problem, but in so focused a way that the accompanying skill description barely consumes more context than the tool's self-descriptive name.

I still didn't follow, you mean to pipe things between tool calls? Like if you want to query something and then update another without the intermediate getting brought in context?

Instead of requiring each session to understand the n tools used to solve a particular problem, you bundle up the solution in a conventional script (that's what I meant by "can", as in canning) that the agent can use with very little documentation in the context. When the model is smart enough to figure out the composition of underlying tools during regular execution, it will also be able to do the canning up as a script and write the lightweight documentation that turns the script into a skill. Subsequent use will only require that lightweight documentation in context.

Won't you just end up with hundreds of very specific scripts that can each only do one very narrow thing? And now they'll all have their description and name in context.

Older than that, as it implies GPT-4o is current.

Deferred tool loading is not part of MCP. It's a Claude API special parameter that most other LLM APIs do not support.

OpenAI API also supports defer_loading https://developers.openai.com/api/docs/guides/tools-tool-sea...

And it's not actually necessary for it to exist at the API level. It's a pattern. Making it API-side is just an optimization.

To do it client-side: 1. Define a single tool, tool_search 2. List the names of your deferred tools in context (or tool_search's description) 3. When tool_search is called, match the query against the tool names (or names + descriptions) 4. Append the matched tool def to the context in a new <system>-esque tag
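The four steps above can be sketched roughly like this. It's a minimal illustration, not how any particular harness does it: the tool names, the `ToolDef` shape, the `<available_tools>` tag, and the substring matching are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class ToolDef:
    name: str
    description: str


# Hypothetical deferred tools; names and descriptions are invented for illustration.
DEFERRED = {
    t.name: t
    for t in [
        ToolDef("github_create_issue", "Open a new issue in a GitHub repository"),
        ToolDef("github_list_prs", "List open pull requests in a repository"),
        ToolDef("calendar_add_event", "Add an event to the user's calendar"),
    ]
}


def tool_search_description() -> str:
    """Step 2: expose only the deferred tool names, inside tool_search's description."""
    return "Search for a tool by keyword. Available: " + ", ".join(sorted(DEFERRED))


def tool_search(query: str, use_descriptions: bool = True) -> list[ToolDef]:
    """Step 3: naive substring match against names, or names + descriptions."""
    terms = query.lower().split()
    hits = []
    for tool in DEFERRED.values():
        haystack = tool.name.lower()
        if use_descriptions:
            haystack += " " + tool.description.lower()
        if any(term in haystack for term in terms):
            hits.append(tool)
    return hits


def render_for_context(tools: list[ToolDef]) -> str:
    """Step 4: wrap the matched definitions in a system-style tag to append to context."""
    body = "\n".join(f"{t.name}: {t.description}" for t in tools)
    return f"<available_tools>\n{body}\n</available_tools>"
```

The `use_descriptions` flag is the knob in question when deciding whether to match on names only or on names plus descriptions: searching "pull" finds `github_list_prs` only when descriptions are included.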

Claude Code (as of the leak) does this client side. You can even see the custom matching function and A/B tests about whether to include the descriptions.

Whether or not that tool definition comes from MCP or a local definition is kind of beside the point.


On the flip side, Claude is at fault in not letting you choose which tools on which MCP servers to keep in context. When I first started using MCP about a year ago (not on Claude Code), my tools actually let me selectively turn individual tools on and off.

Crazy that the company that invented MCP is not putting basic features like this in the product.


I think if you deny a tool, it won't be loaded in context at all, ever; even its name and description won't be loaded.

Deferred cli/skill loading is also not part of CLIs or skills, it's all about how the coding agent/harness is implemented.

The article is from May 29 2026, they're lying about that update being 'recent' and coming after the article to make themselves look better.

US has over 10x the number of data centers as China; and produces 2x more energy per capita than China.

what about energy consumption per capita?

What about it? Energy production basically has to equal energy consumption in the medium term, so if the grandparent comment is correct, it is 2x per capita.

Dunno how trustworthy this source is, but it says ~35 MWh/person in China and 77 MWh/person in USA.

https://ourworldindata.org/grapher/per-capita-energy-use


> China's electricity generation going parabolic

Even if we all switch to Chinese models, the west isn't going to be running the model on Chinese servers... and the majority of costs are from inference.

> cheaper yet equally good talent

China has tech talent, but this isn't a 3rd world developing nation. Chinese AI researchers are getting paid $10M+ USD/year salaries.

Also they're equally good, but somehow consistently behind?


Training models is as much art as science at this point. There's no gap in scientific acumen at Chinese labs, but the US has more real world experience in the art of training large models, and the US has the capital allocation lead.

Yes, but when the heads of the CCP make something their target, they chase it with all their might. Read the recent news that Chinese AI researchers can't leave China. China is now going after India's diamond industry.

musk sued long after the statute of limitations because what openai did was only objectionable to musk once he decided to become their competitor.

and in this scenario, i’m supposed to root for musk who tried to use the court to harm a competitor who’s winning in the marketplace against xAI?

no thanks. if you can’t compete in the marketplace, the court isn’t your backup plan. there’s nothing positive about the weaponization of the courts.


was the industrial revolution oriented for ordinary people at the time it occurred? were a lot of workers buying flying shuttles in the 1700s?


I'm confused - you're suggesting that past suffering justifies present suffering?


My first day of orientation at the CS dept was at the height of the dot-com crash. I think I got told by 20+ seniors that day to drop out before paying a single bill. That it was all pointless, the internet was an overvalued bubble, and no one was getting hired. The mood on campus was scary for almost two years after the crash. If we had had social media back then, I can only imagine how much more those fears would have been amplified.


He's pointing out that labor has always opposed labor saving technology, despite that being the basis of our modern quality of life.


In the past, "labor saving technology" has always spawned alternate jobs that people could take with some retraining. This time it might be truly different. If one day AI can actually do all knowledge work, there might not be anything left for former knowledge workers to do. There's no physical law that says new technology necessarily produces 1:1 new, different jobs.


Most jobs for most of human history have not been "knowledge work" involving symbolic manipulation. Maybe all the marketers, business analysts and software engineers of the world can take up their true callings as plumbers, carpenters and dishwasher repair people.


You think that all knowledge workers of the world will accept their social and material downgrades without making waves? That they'll all be able to find manual work?


Who knows, maybe we'll come to value manual and caring work once AI can easily do all the moving-electrons-on-a-screen?

The financial and social hierarchy you allude to is not immutable. Programming was once a low-paid, low-status job done largely by women. It's only relatively recently that it's become a lucrative, high-status masculine-coded career.


> In the past, "labor saving technology" has always spawned alternate jobs that people could take with some retraining.

Labor saving technology does not create enough alternative jobs to employ all those that it displaced, otherwise it wouldn't be labor saving.

Instead, the surplus created by these technologies allows that society to deploy labor on less immediately necessary jobs. These jobs weren't created by the technology, they were always there, but society did not have the resources to staff them (think education, research, academia, merchants, etc.)

This dynamic has been true since pre-historic times, so you'll need some extraordinary evidence if you want us to believe this time is different.


Many people who point out that the Industrial Revolution became the basis of our modern quality of life skip what happened between the 17xx-18xx period and today.

Things like Unions, Wars, etc.

What comes after new technology has always been the elite class owning them all and forcing everybody else to suffer until something managed the distribution of resources slightly better (War forces that).


I mean the Luddites were mad for a reason, and many may forget the industrial revolution was a rather bloody affair.

Avoiding a repeat of that while also increasing productivity would be great.


The Luddites were mad not because the machines put them out of work but because the machines were supremely shitty. The machines were dangerous and they made lousy products that reflected a lack of pride in workmanship.

The Luddites were all for saving labor, but not if enshittified products and slavery to unreliable machines were the price.

Sounds pretty familiar to me.


Many Luddites were protesting labor conditions. At the time the majority of labor laws were being written by the capital class with the help of political leaders and the constabulary. Common complaints were working hours, child labor, safety, wages, and protection from furlough. There were some who protested the quality of the product the machines created... but I would say those are the minority.

Destroying the machines was a way to gain leverage for a class of people who had none. People had been using looms for centuries. It wasn't the technology that was the problem... that's what the victors, the capitalists, have written was the reason.


No, that's why unions exist.


Unions and worker's rights exist because workers were exploited to the max during the Industrial Revolution.


yes.


300k-400k isn’t the current limit if you create modules and/or organize the code reasonably.. for the same reason we do this for humans: it allows us to interact with a component without loading the internals into our context.

you can also execute larger tasks than this using subagents to divide the work so each segment doesn’t exceed the usable context window. i regularly execute tasks that require hundreds of subagents, for example.

in practice the context window is effectively unlimited or at least exceptionally high — 100m+ tokens. it just requires you to structure the work so it can be done effectively — not so dissimilar to what you would do for a person


That makes it not a context window.

How to organize code like you said, and how agents interact with it, to keep the actual context window small is the fundamental challenge.


I keep getting surprised that people who are all-in on this ("i regularly execute tasks that require hundreds of subagents") don't have any idea of what is happening even a single layer below their interface to the LLM ("in practice the context window is effectively unlimited or at least exceptionally high — 100m+ tokens.")

I looked at that response by GP (rgbrenner) and refrained from replying because if someone is both running hundreds of agents at a time AND oblivious to what "context window" means, there is no possible sane discourse that would result from any engagement.


ok "series of context windows spread across many agents".. sure much clearer.

Doesn't change my point: the amount of code the agent can operate on is very large, if not unlimited, as long as you put even a little bit of thought into structuring things so it can be divided along a boundary.

If you let the codebase degrade into spaghetti, then the LLM is going to have the same problem any engineer would have with that. The rules for good code didn't disappear.


Context windows don't necessarily divide cleanly. Getting each agent to be able to work within a context window is a hard problem.

It's not like if your context window with one agent is n, your context window with 10 agents is n/10. It takes some skill, but that is also where a lot of the advances are coming in.


300k tokens--the useable context window of a single agent--is about 40k lines of code and you can't figure out a natural breakpoint within that code to divide up the task?
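For what it's worth, those two figures imply roughly 7.5 tokens per line (300,000 / 40,000). A rough sketch of dividing work along module boundaries under that assumption follows; the module names and line counts in the usage note are hypothetical, and real token counts per line vary a lot by language and style.

```python
# Tokens per line implied by the figures above: 300k tokens for ~40k lines.
TOKENS_PER_LINE = 300_000 / 40_000  # 7.5


def plan_batches(module_lines: dict[str, int], budget_tokens: int = 300_000) -> list[list[str]]:
    """Greedy sequential packing: walk modules in order and start a new batch
    (one batch per subagent) whenever the next module would exceed the budget."""
    batches: list[list[str]] = []
    current: list[str] = []
    used = 0.0
    for name, lines in module_lines.items():
        cost = lines * TOKENS_PER_LINE
        if cost > budget_tokens:
            # A single module that does not fit needs a finer boundary first.
            raise ValueError(f"{name} alone exceeds the budget; split it further")
        if used + cost > budget_tokens:
            batches.append(current)
            current, used = [], 0.0
        current.append(name)
        used += cost
    if current:
        batches.append(current)
    return batches
```

With hypothetical modules `{"auth": 15_000, "billing": 20_000, "api": 10_000, "ui": 30_000}` (line counts), this yields two batches: `auth` + `billing`, then `api` + `ui`. It says nothing about how much shared context each batch still needs, which is the hard part being argued about above.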


In the same way that all coding docs are available publicly


> Solving scale-to-zero for WordPress hosting platforms

> WordPress is not serverless

Just not accurate. WordPress doesn't prevent this.. It's up to hosting providers to work on their infra so it can run in a serverless fashion.

For example: https://www.agiler.io

That's serverless wordpress that scales to zero.. no changes to WordPress, plugins or anything else.. just platform infra.


Last time I checked Wordpress was completely fine living in a couple of PHP files on a webspace. That’s like the pinnacle of „serverless“, is it not?


mysql/mariadb and the shared filesystem requirements are a bit different than what lambda/etc provides. So not really, but it's all solvable clearly.


Not even a little bit.


app.tryalma.com doesn't work on safari either.. says it's chrome only.

So the story isn't really about firefox.. it's about Chrome's marketshare being high enough that some companies are happy to ignore every other browser.


Chrome is the new IE!


But the security risk wasn't taken by OpenClaw. Releasing vulnerable software that users run on their own machines isn't going to compromise OpenClaw itself. It can still deliver value for its users while also requiring those same users to handle the insecurity of the software themselves (by either ignoring it or setting up sandboxes, etc. to reduce the risk, and then maybe that reduced risk is weighed against the novelty and value of the software, which then makes it worth it to the user to set up).

On the other hand, if OpenClaw were structured as a SaaS, this entire project would have burned to the ground the first day it was launched.

So by releasing it as something you needed to run on your own hardware, the security requirement was reduced from essential, to a feature that some users would be happy to live without. If you were developing a competitor, security could be one feature you compete on--and it would increase the number of people willing to run your software and reduce the friction of setting up sandboxes/VMs to run it.


This argument has the same obvious flaws as the anti-mask/anti-vax movement (which unfortunately means there will always be a fringe that don't care). These things are allowed to interact with the outside world, it's not as simple as "users can blow their own system up, it's their responsibility".

I don't need to think hard to speculate on what might go wrong here - will it answer spam emails sincerely? Start cancelling flights for you by accident? Send nuisance emails to notable software developers for their contribution to society[1]? Start opening unsolicited PRs on matplotlib?

[1] https://news.ycombinator.com/item?id=46394867


We really needed to have made software engineering into a real, licensed engineering practice over a decade ago. You wanna write code that others will use? You need to be held to a binding set of ethical standards.


Even though it means I probably wouldn't have a job, I think about this a lot and agree that it should. Nowadays suggesting programmers should be highly knowledgeable at what they do will get you called a gatekeeper.


While it is literally gatekeeping, it's necessary. Doctors, architects, lawyers should be gatekept.

I used to work on industrial lifting crane simulation software. People used it to plan out how to perform big lift jobs to make sure they were safe. Literal, "if we fuck this up, people could die" levels of responsibility. All the qualification I had was my BS in CS and two years of experience. It was lucky circumstance that I was actually quite good at math and physics and so able to discover that there were major errors in the physics model.

Not every programmer is going to encounter issues like that, but also, neither can we predict where things will end up. Not every lawyer is going to be a criminal defense lawyer. Not every doctor is going to be a brain surgeon. Not every architect is going to design skyscrapers. But they all do work that needs to be warranteed in some way.

We're already seeing people getting killed because of AI. Brian in middle management "getting to code again" is not a good enough reason.


> While it is literally gatekeeping, it's necessary. Doctors, architects, lawyers should be gatekept.

That was exactly my point. It's one of those things where people deliberately use a word that is technically correct in a context where it doesn't, or shouldn't, hold true. Does this mean I want to stop people from "vibe coding" flappy bird? No, of course not, but as per your original comment, yes, there should be stricter regulations when it comes to hiring.


Yeah, I know what you mean. It is a weapon people throw around on social media sites.


At least during the Covid response, your concerns over anti-mask and anti-vaccine issues seem unwarranted.

The claims being shared by officials at the time were that anyone vaccinated was immune and couldn't catch it. Claims were similarly made that we needed roughly a 60% vaccination rate to reach herd immunity. With that precedent being set, it shouldn't matter whether one person chose not to mask up or get the jab; most everyone else could do so to fully protect themselves, and those who can't would only be at risk if more than 40% of the population weren't on board with the masking and vaccination protocols.


> that anyone vaccinated was immune and couldn't catch it.

Those claims disappeared rapidly when it became clear they offered some protection, and reduced severity, but not immunity.

People seem to be taking a lot more “lessons” from COVID than are realistic or beneficial. Nobody could get everything right. There couldn’t possibly be clear “right” answers, because nobody knew for sure how serious the disease could become as it propagated, evolved, and responded to mitigations. Converging on consistent shared viewpoints, coordinating responses, and working through various solutions to a new threat on that scale was just going to be a mess.


Those claims were made after the studies were done over a short duration and specifically only watching for subjects who reported symptoms.

I'm in no way taking a side here on whether anyone should have chosen to get vaccinated or wear masks, only that the information at the time being pushed out from experts doesn't align with an after the fact condemnation of anyone who chose not to.


I specifically wasn't referring to that instance (if anything I'm thinking more of the recent increase in measles outbreaks), I myself don't hold a strong view on COVID vaccinations. The trade-offs, and herd immunity thresholds, are different for different diseases.

Do we know that 0.1% prevalence of "unvaccinated" AI agents won't already be terrible?


Fair enough. I assumed you had Covid in mind with an anti-mask reference. At least in modern history in the US, we have only even considered masks during the Covid response.

I may be out of touch, but I haven't heard about masks for measles, though it does spread through aerosol droplets so that would be a reasonable recommendation.


I think you're right - outside of COVID, it's not fringe, it's an accepted norm.

Personally I at least wish sick people would mask up on planes! Much more efficient than everyone else masking up or risking exposure.


Oh I wish sick people would just not get on a plane. I've cancelled a trip before, the last thing I want to do when sick is deal with the TSA, stand around in an airport, and be stuck in a metal tube with a bunch of other people.


Love passing off the externalities of security to the user, and then the second order externalities of an LLM that then blackmails people in the wild. Love how we just don’t care anymore.


You should join the tobacco lobby! Genius!


More straightforwardly, people are generally very forgiving when people make mistakes, and very unforgiving when computers do. Look at how we view a person accidentally killing someone in a traffic accident versus when a robotaxi does it. Having people run it on their own hardware makes them take responsibility for it mentally, so gives a lot of leeway for errors.


I think that’s generally because humans can be held accountable, but automated systems can not. We hold automated systems to a higher standard because there are no consequences for the system if it fails, beyond being shut off. On the other hand, there’s a genuine multitude of ways that a human can be held accountable, from stern admonishment to capital punishment.

I’m a broken record on this topic but it always comes back to liability.


That's one aspect.

Another aspect is that we have much higher expectations of machines than humans in regards to fault-tolerance.


Traffic accidents are the same symptom of fundamentally different underlying problems among human-driven and algorithmically-driven vehicles. Two very similar people differ more than the two most different robo taxis in any given uniform fleet— if one has some sort of bug or design shortcoming that kills people, they almost certainly all will. That’s why product (including automobile) recalls exist, but we don’t take away everyone’s license when one person gets into an accident. People have enough variance that acting on a whole population because of individual errors doesn’t make sense— even for pretty common errors. The cost/benefit is totally different for mass-produced goods.

Also, when individual drivers accidentally kill somebody in a traffic accident, they’re civilly liable under the same system as entities driving many cars through a collection of algorithms. The entities driving many cars can and should have a much greater exposure to risk, and be held to incomparably higher standards because the risk of getting it wrong is much, much greater.


Oh please, why equate IT BS with cancer? If the null pointer was a billion dollar mistake, then C was a trillion dollar invention.

At this scale of investment countries will have no problem cheapening the value of human life. It's part and parcel of living through another industrial revolution.


Exactly! I was digging into the OpenClaw codebase for the last 2 weeks and the core ideas are very inspiring.

The main work he has done to enable personal agent is his army of CLIs, like 40 of them.

The harness he used, pi-mono is also a great choice because of its extensibility. I was working on a similar project (1) for the last few months with Claude Code and it’s not really the best fit for personal agent and it’s pretty heavy.

Since I was planning to release my project as a Cloud offering, I worked mainly on sandboxing it, which turned out to be the right choice given OpenClaw is opensource and I can plug its runtime to replace Claude Code.

I decided to release it as opensource because at this point software is free.

1: https://github.com/lobu-ai/lobu


I don't agree that making your users run the binaries means security isn't your concern. Perhaps it doesn't have to be quite as buttoned down as a commercial product, but you can't release something broken by design and wash your hands of the consequences. Within a few months, someone is going to deploy a large-scale exploit which absolutely ruins OpenClaw users, and the author's new OpenAI job will probably allow him to evade any real accountability for it.


> But the security risk wasnt taken by OpenClaw

This is the genius move at the core of the phenomenon.

While everyone else was busy trying to address safety problems, the OpenClaw project took the opposite approach: They advertised it as dangerous and said only experienced power users should use it. This warning seemingly only made it more enticing to a lot of users.

I’ve been fascinated by how well the project has just dodged and avoided any consequences for the problems it has introduced. When it was revealed that the #1 skill was malware masquerading as a Twitter integration I thought for sure there would be some reporting on the problems. The recent story about an OpenClaw bot publishing hit pieces seemed like another tipping point for journalists covering the story.

Though maybe this inflection point made it the most obvious time to jump off of the hype train and join one of the labs. It takes a while for journalists to sync up and decide to flip to negative coverage of a phenomenon after they cover the rise, but now it appears that the story has changed again before any narratives could build about the problems with OpenClaw.


I am guessing there will be an OpenClaw "competitor" targeting Enterprise within the next 1-2 months. If OpenAI, Anthropic or Gemini are fast and smart about it they could grab some serious ground.

OpenClaw showed what an "AI Personal Assistant" should be capable of. Now it's time to get it in a form-factor businesses can safely use.


With the guard rails up, right? Right?

