backdoor • AI Security Wire

May 22, 2026

Sleeper Agents in Fine-Tuned LLMs: Backdoors That Survive Alignment

New research demonstrates that backdoor behaviours introduced into LLMs during fine-tuning can persist through subsequent safety alignment procedures, including RLHF and adversarial training, posing significant supply chain risks.