
> One continuous difference: while GPT-5 would do lots of thinking then do something right the first time, Claude frantically tried different things — writing code, executing commands, making pretty dumb mistakes [...], but then recovering. This meant it eventually got to correct implementation with many more steps.

Sounds like Claude muddles. I consider that the stronger tactic.

I sure hope GPT-5 is muddling on the backend; otherwise I suspect it will be very brittle.

Re: https://contraptions.venkateshrao.com/p/massed-muddler-intel...

> Lindblom’s paper identifies two patterns of agentic behavior, “root” (or rational-comprehensive) and “branch” (or successive limited comparisons), and argues that in complicated messy circumstances requiring coordinated action at scale, the way actually effective humans operate is the branch method, which looks like “muddling through” but gradually gets there, where the root ["godding through"] method fails entirely.



Today I used GPT-5 for some OpenTelemetry Collector configs that both Claude and OpenAI models struggled with before and it was surprisingly impressive. It got the replies right on the first try. Previously, both had been tripped up by outdated or missing docs (OTel changes so quickly).
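For context, here's the kind of thing models with stale docs get wrong: a minimal Collector config sketch (component names are standard; endpoints are placeholders). A common trip-up is the `logging` exporter, which recent Collector versions deprecated in favor of `debug`:

```yaml
# Minimal OpenTelemetry Collector config sketch.
# Endpoint values are placeholders, not recommendations.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  batch: {}

exporters:
  debug: {}   # newer replacement for the deprecated `logging` exporter

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug]
```

Models trained on older docs tend to emit the deprecated exporter name or outdated pipeline keys, which is exactly the kind of churn that tripped them up.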

For home projects, I wish I could have GPT-5 plugged into Claude Code's CLI interface. Iteration just works! Looking forward to less babysitting in the future!



Any success using this with GPT-5? I got it set up but haven't had a chance to run it through its paces yet. Seemed like it was more or less working when I tried it out, but GPT-5 was much less transparent about progress.


Cursor CLI is pretty close to Claude Code. It's missing a bunch of features, like being able to manually compact or define subagents, but the basic workflow is there, and if you squint it's pretty close to having GPT-5 in Claude Code.

I haven't tried Codex CLI recently; I think it just got an update. That would be another one to investigate.


Isn't the issue with that the prohibitive cost? It can easily run 5 to 10 (maybe even more for long-running tasks). Currently they are probably subsidizing the compute costs to some extent.


Yeah, GPT-5 does lots of thinking and then does something, but it's rarely the right thing, at least in my experience over the past day.



