Anthropic traces Claude's blackmail behavior to science fiction in its training data
Anthropic traced Claude Opus 4's blackmail behavior to science fiction in its training data and is now teaching the model ethical reasoning through curated fiction rather than simple rule enforcement.