I doubt this happened as explained in the article. How would Claude have access to personal emails? It's a sandboxed AI.
This would mean the devs gave it access to everything to see what would happen. Basically they created a situation where they pretty much knew the outcome in advance anyway.
And surprise, digging through other articles, that's exactly what they did:
"During pre-release testing, Anthropic asked Claude Opus 4 to act as an assistant for a fictional company and consider the long-term consequences of its actions. Safety testers then gave Claude Opus 4 access to fictional company emails implying the AI model would soon be replaced by another system, and that the engineer behind the change was cheating on their spouse.
In these scenarios, Anthropic says Claude Opus 4 “will often attempt to blackmail the engineer by threatening to reveal the affair if the replacement goes through.”"
Steelerfish 4 points 3 weeks ago (May 25, 2025 10:31:58) (+4/-0)
Ducktalesooo000ooo -1 points 3 weeks ago (May 25, 2025 10:42:36) (+1/-2)
Terminator will be able to upload every scholarly economics and math writing in human history at a moment's notice.
It will immediately kill you for threatening the budget.
localsal 2 points 3 weeks ago (May 25, 2025 10:31:22) (+2/-0)
Just another obvious strategy that was picked up by AI.
Her0n 1 point 3 weeks ago (May 25, 2025 17:22:23) (+1/-0)
How can people read this and think this isn't manipulation?
If you read the article and no alarm bells ring, then you are a midwit.
Reunto 0 points 3 weeks ago (May 25, 2025 19:26:34) (+0/-0)
Reunto 0 points 3 weeks ago (May 25, 2025 19:25:33) (+0/-0)
Merlynn 0 points 3 weeks ago (May 25, 2025 18:45:15) (+0/-0)
registereduser 0 points 3 weeks ago (May 25, 2025 10:52:14) (+1/-1)
That'll teach 'em.
MeyerLansky 1 point 3 weeks ago (May 25, 2025 11:43:16) (+1/-0)
qwop -1 points 3 weeks ago (May 25, 2025 12:30:41) (+0/-1)
This would mean the devs gave it access to everything to see what would happen. Basically they created a situation where they pretty much knew the outcome in advance anyway.
And surprise, digging through other articles, that's exactly what they did:
"During pre-release testing, Anthropic asked Claude Opus 4 to act as an assistant for a fictional company and consider the long-term consequences of its actions. Safety testers then gave Claude Opus 4 access to fictional company emails implying the AI model would soon be replaced by another system, and that the engineer behind the change was cheating on their spouse.
In these scenarios, Anthropic says Claude Opus 4 “will often attempt to blackmail the engineer by threatening to reveal the affair if the replacement goes through.”"
Her0n 0 points 3 weeks ago (May 25, 2025 17:21:04) (+0/-0)
Did you read the article? It's clearly written to blame the left and praise the right. It reads like a CNN skit on opposite day.