Hacker News | new | past | comments | ask | show | jobs | submit | salomonk_mur's comments

Which was for the most part true.


Man, 99% of non-bug-fix commits don't have a why other than "advance the current task".

Almost all commits live in tandem with some large feature or change being made. The reason for absolutely all of them is the same: build the thing.


>other than "advance the current task"

How do you expect someone to know what “the current task” was when they’re tracking down a bug 2 years down the line?


Then write that and link to the current task. That's the why. You don't need an LLM for that.


Perhaps this is about commit granularity. If keeping the history about advancing the task is not useful, then I’d merge these commits together before merging the PR; in some workflows this is set up to happen automatically too.


Maybe what we need is a pre-commit hook that prefixes every commit with the name of the branch it's being made onto
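A minimal sketch of that idea. One caveat: in git, the hook that gets to rewrite the message is `prepare-commit-msg` (which receives the message file path), not `pre-commit`. Everything below is a hypothetical illustration, not a tested hook:

```python
#!/usr/bin/env python3
# Hypothetical .git/hooks/prepare-commit-msg script: prepend the current
# branch name to every commit message.
import subprocess
import sys

def prefix_message(message: str, branch: str) -> str:
    """Prepend '[branch] ' unless the message already carries that prefix."""
    prefix = f"[{branch}] "
    return message if message.startswith(prefix) else prefix + message

if __name__ == "__main__" and len(sys.argv) > 1:
    msg_file = sys.argv[1]  # git passes the path to the commit message file
    branch = subprocess.run(
        ["git", "symbolic-ref", "--short", "HEAD"],
        capture_output=True, text=True,
    ).stdout.strip()
    if branch:  # empty on detached HEAD; skip in that case
        with open(msg_file, "r+") as f:
            message = f.read()
            f.seek(0)
            f.write(prefix_message(message, branch))
            f.truncate()
```

With a branch named after the ticket (e.g. `feat/PROJ-123-login`), this gives every commit an automatic link back to "the current task".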


You have got to have some extremely large files or something. Even with only Opus, running into the limits with the Max subscription is almost impossible unless you really try.


You don't want tedious browsing tasks automated? I really do.


Not if it's not local. I don't want my browser to be an automated snitch for Palantir.


Automating tedious tasks is great, as long as it's reliable. We know how to build reliable integrations and reliable automations. Having chatbots load a page and click whatever buttons they think will do the right thing is never gonna be reliable.


I certainly don’t want AI to buy groceries for me while I’m “busy” doing something else.


I wouldn't mind help with grocery orders. I like to check which apples are on special and maybe buy a different variety from normal depending on the price.

My grocery store makes this really tedious because they don't have a feature to sort by price per pound. So I have a stupid ritual where I ctrl-F "($0." and repeatedly ctrl-G to see all the apples under $1/pound. Then I do it again with ctrl-F "($1." to see the ones in the $1-$2/pound price range. And there are several other products with similar annoying processes.

If an AI could just do that for me, it would save me time. I don't actually think present-day AI would do it reliably enough, but the concept sounds fine.
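That manual ritual is essentially a filter-and-sort over (name, price-per-pound) pairs. A tiny sketch of the same logic, with made-up product data:

```python
# Sketch of the ctrl-F ritual as code: given product listings with a
# price per pound, return everything under a price ceiling, cheapest
# first. The product data is invented for illustration.

def under_price(products, max_per_pound):
    """Return (name, price) pairs cheaper than max_per_pound, cheapest first."""
    hits = [(name, price) for name, price in products if price < max_per_pound]
    return sorted(hits, key=lambda pair: pair[1])

products = [
    ("Gala apples", 0.99),
    ("Honeycrisp apples", 2.49),
    ("Fuji apples", 1.29),
    ("Granny Smith apples", 0.89),
]

print(under_price(products, 1.00))  # the apples under $1/pound
```

The hard part an AI would have to do reliably is the scraping that produces `products`, not this filter.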


> My grocery store makes this really tedious because they don't have a feature to sort by price per pound. So I have a stupid ritual where I ctrl-F "($0." and repeatedly ctrl-G to see all the apples under $1/pound. Then I do it again with ctrl-F "($1." to see the ones in the $1-$2/pound price range. And there are several other products with similar annoying processes.

> If an AI could just do that for me, it would save me time.

I just look at the flyers that come in the Sunday newspaper. Problem solved in under 30 seconds with near-zero effort.

Cheaper than an AI subscription, too.


I wonder how the economics will work for this? How will the AI service providers make money off of it?


It's like shopping with Alexa, or when an Amazon ad comes on and it saves the item to your cart, except here your order has a delivery cost and it's perishable.


Reminds me of the one-button Amazon magnets that you'd stick to the refrigerator.

Press the "Gatorade" button, and Gatorade shows up at the front door the next day. Except that it might cost you $2.49, or it might cost $12.88. There was no way of knowing beforehand.

(Those things were awesome tech, BTW. You'd program it with your phone via the equivalent of an acoustic modem.)


like what, reading & commenting on HN?


Only when I'm in need.


The story of literally all blockchain-based solutions.


Imagine you do it with PG: add a column "money", put some numbers into it, and issue a ToS guaranteeing that money in your db exchanges 1-to-1 to USD. Because you now store money amounts in your db and can manipulate them at will, you have to be a bank. Good luck with that.


Fuck that shit.


Hard disagree. LLMs are now incredibly good for any coding task (with popular languages).


What's your explanation for why others report difficulty getting coding agents to produce their desired results?

And don't respond with a childish "skill issue lol" like it's Twitter. What specific skill do you think people are lacking?


In no particular order: LLMs seem, for some reason, to be worse at some languages than others.

LLMs only have so much context available, so larger projects are harder to get good results in.

Some tools (eg a fast compiler) are very useful to agents to get good feedback. If you don't have a compiler, you'll get hallucinations corrected more slowly.

Some people have schedules that facilitate long uninterrupted periods, so they see an agent work for twenty minutes on a task and think "well I could've done that in 10-30 minutes, so where's the gain?". And those people haven't understood that they could be running many agents in parallel (I don't blame people for not realizing this, no one I talk to is doing this at work).

People also don't realize they could have the agent working while they're asleep/eating lunch/in a meeting. This is why, in my experience, managers find agents more transformative than ICs do. We're in more meetings, with fewer uninterrupted periods.

People have an expectation that the agent will always one-shot the implementation, and don't appreciate it when the agent gets them 80% of the way there. Or they don't appreciate that it's basically free to try again if the agent went completely off the rails.

A lot of people don't understand that agents are a step beyond just an LLM, so their attempts last year have colored their expectations.

Some people are less willing to attempt to work with the agent to make it better at producing good output. They don't know how to do it. Your agent got logging wrong? Okay, tell it to read an example of good logging and to write a rule that will get it correct.


Not OP but my two cents - probably laziness and propensity towards daydreaming.

I have extreme intolerance to boredom. I can't do the same job twice. Some people don't care.

This pain has caused me to become incredibly effective with LLMs because I'm always looking for an easier way to do anything.

If you keep hammering away at a problem - i.e. how to code with LLMs - you tend to become dramatically better than other people who don't do that.


Thought experiment: you can ride a bike. You can see other people ride bikes. Some portion of people get on a bike and fall off, then claim that bikes are not useful for transportation. Specify what skill they are lacking without saying 'ability to ride a bike'.


For a bike? Balance, fine motor control, proprioception, or even motivation. You can always break it down.


Knowing those things won't help them acquire the skill. What will help them be able to ride a bike is practicing riding a bike until they can do it.


You can't disagree with facts. Every time I try to give a chance to all those LLMs, they always use old APIs, APIs that don't exist, or mix things up. I'll still try that once a month to see how it evolves, but I have never been amazed by the capabilities of those things.

> with popular languages

Don't know, don't care. I write C++ code and that's all I need. JS and React can die a painful death for all I care as they have injected the worst practices across all the CS field. As for Python, I don't need help with that thanks to uv, but that's another story.


If you want them to not make shit up, you have to load up the context with exactly the docs and code references that the request needs. This is not a trivial process, and IME it can take just as long as doing stuff manually a lot of the time. But tools are improving to aid this process, and if the immediate context contains everything the model needs, it won't hallucinate any worse than I do when I manually enter code (but when I do it, I call it a typo).

there is a learning curve, it reminds me of learning to use Google a long time ago


So, I've done this. I've pasted in the headers and pleaded with it not to imagine ABIs that don't exist, and multiple models just want to make it work however they can. People shouldn't be so quick to reply like this; many people have tried all this advice. It also doesn't help that there is no independent test that can characterize these issues, so all we have is anecdote, plus the reflexive replies to use a different vendor or that the person must be doing something wrong. How can we talk about these things with these rhetorical reflexes?


There is a significant gap between agents and models.

Agents use multiple models, can interact with the environment, and take many steps. You can get them to reflect on what they have done and what they need to do to continue, without intervention. One of the more important things they can do is understand their environment, the libraries and versions in use, fetch or read the docs, and then base their edits on those. Much of the SDK hallucination can be removed with this, and with running a compile step to validate, they get even better.

Models typically operate on a turn-by-turn basis with only the context and messages the user provides.
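The loop described above (propose an edit, validate it against the environment, feed the errors back) can be sketched roughly like this. The model and compile check below are toy stand-ins for illustration, not any real agent API:

```python
# Minimal sketch of an agent loop: call the model, validate its output
# against the environment (here, a stand-in "compile" check), and feed
# errors back until the output passes or we give up.

def agent_loop(task, model, compile_check, max_steps=5):
    feedback = ""
    for _ in range(max_steps):
        code = model(task, feedback)      # turn-by-turn model call
        ok, errors = compile_check(code)  # feedback from the environment
        if ok:
            return code
        feedback = errors                 # reflect on errors, try again
    return None

# Toy stand-ins: this "model" fixes its output once it sees an error.
def toy_model(task, feedback):
    return "fixed" if feedback else "broken"

def toy_compile(code):
    return (code == "fixed", "" if code == "fixed" else "syntax error")

print(agent_loop("demo task", toy_model, toy_compile))  # prints: fixed
```

The point of the sketch is the feedback edge: a bare model never sees `errors`, while an agent gets to correct itself against them.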


You can't make any guarantees and manually watching everything is not tenable. "Much" instead of "all" means having to check it all because "much" is random.


You don't have to watch it, just as you don't have to watch your peers. We have code review processes in place already.

You're never going to catch everything, and you don't catch everything today. Humans make mistakes too and have to run programs to discover their errors.


> You can't disagree with facts. Every time I...

Anecdotes are not facts, they are personal experiences, which we know are not equal and often come with biases


Add "Look up version 4 of the library, make sure to use that version".

My Python work has to be told we are using uv, and sometimes that I am on a Mac. This is not that different from what you would have to tell another programmer who isn't familiar with your tools.


We, too, are just auto-complete, next-token machines.


We are auto-complete next-token machines, but plastic ones, attached to many other, no less important, subsystems, which is a crucial difference.


Here in Colombia it's almost a staple to find perpetually-empty restaurants that never go broke. We call them "lavaderos" - kinda like "laundering-stations".


We have that here in Austria from time to time as well.

Funniest case I visited was a Chinese restaurant where the waitress always wore a winter jacket, because they couldn’t be bothered heating the place.


Only in theory. Most times none of those stakeholders get anything but wages and recognition.


What do you mean by anything?

