Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

The idea of them purposefully wasting my time by having the model act dumber and me having to argue with it without knowing if it’s the prompt or the model was just such an idiotic product decision I can’t believe they shipped that without getting any feedback from users first.


[flagged]


Safety from what? Competitors? That sounds like a product decision. They're puking on any requests that could be used to create LLMs or competitive products.


I would guess prevention of using Claude as a pentesting or hacking platform. This could mean that every script kiddie out there would be a massive risk.


To prevent their models from doing harm in dual-use contexts including CBRN or by accelerating research in authoritarian-backed AI labs.


Anything to prevent mecha ai hitler. At all costs


The road to hell is paved with "good" intentions.


I think you can sympathize with the safety motives while still thinking this was a dumb implementation to degrade silently? I actually have faith in them getting the guardrail triggers pretty good, but consensus seems like they’re not yet there yet.


I think it is clear given the stakes why you would not want to make your guardrails probe-able/invertable.


> if you understood what they think they are building and the culture inside of anthropic you would understand why they did it.

This seems like a cult with extra steps.

Related: I interviewed for Anthropic a few months ago and in place of the usual HR call they have one where they have someone with a suspiciously relevant degree grill you about how committed you are to the 'mission'!

I probably came off as being skeptical, and then, hilariously, I was strongly encouraged to read the book published by the CEO to 'form accurate opinions' on AI safety.


Don't buy it. It is actively deceiving the customer and charging them for the privilige of being lied to.


We do understand why they did it, and the reason is dark and cynical.


They did it to make more money as you waste more time burning tokens with bad responses.


[flagged]


How does degrading responses to a cheaper tier jack up revenues?




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: