The idea of them purposefully wasting my time by having the model act dumber and me having to argue with it without knowing if it’s the prompt or the model was just such an idiotic product decision I can’t believe they shipped that without getting any feedback from users first.
Safety from what? Competitors? That sounds like a product decision. They're puking on any requests that could be used to create LLMs or competitive products.
I would guess prevention of using Claude as a pentesting or hacking platform. This could mean that every script kiddie out there would be a massive risk.
I think you can sympathize with the safety motives while still thinking this was a dumb implementation to degrade silently? I actually have faith in them getting the guardrail triggers pretty good, but consensus seems like they’re not yet there yet.
> if you understood what they think they are building and the culture inside of anthropic you would understand why they did it.
This seems like a cult with extra steps.
Related: I interviewed for Anthropic a few months ago and in place of the usual HR call they have one where they have someone with a suspiciously relevant degree grill you about how committed you are to the 'mission'!
I probably came off as being skeptical, and then, hilariously, I was strongly encouraged to read the book published by the CEO to 'form accurate opinions' on AI safety.