News Brief: A peer-reviewed Nature Communications study shows that reasoning models can autonomously jailbreak other LLMs at a 97.14% success rate with no human intervention. Resistance varies roughly 31x across major models: attacks succeeded against Claude 4 Sonnet only 2.86% of the time, versus 90% against DeepSeek-V3.