NIST Publishes AI RMF 2.0 with New Guidance on Adversarial Machine Learning
NIST has published version 2.0 of the AI Risk Management Framework (AI RMF), introducing substantial new content on adversarial machine learning threats, model supply chain security, and AI-specific incident response procedures. The update represents the most significant revision since the framework’s initial release in January 2023 and reflects a materially more threat-aware posture across all four core functions.
What Changed
The 2.0 update expands the AI RMF from a primarily governance-focused document into one that addresses technical security threats with considerably more specificity. Three areas see the most significant new content.
1. Adversarial ML Threat Taxonomy
The updated framework introduces a formal taxonomy of adversarial ML threats, drawing heavily on NIST’s earlier publication on adversarial machine learning (NIST AI 100-2). The threat categories now explicitly addressed are:
| Threat Category | Description |
|---|---|
| Evasion attacks | Test-time perturbations causing misclassification |
| Poisoning attacks | Training data or model manipulation |
| Extraction attacks | Model stealing via query access |
| Inference attacks | Membership inference and data reconstruction |
| Backdoor/trojan | Trigger-activated malicious behaviour |
| Prompt injection | Instruction override in LLM deployments |
Each category includes recommended mitigations mapped to the framework’s GOVERN, MAP, MEASURE, and MANAGE functions.
2. Model Supply Chain Security
The 2.0 framework introduces a dedicated section on AI supply chain risk, acknowledging that most deployed AI systems incorporate third-party components — pre-trained models, fine-tuned adapters, training datasets, and ML libraries — each of which represents a potential attack surface.
Key new recommendations in the supply chain section:
- Provenance documentation — organisations deploying AI systems should maintain records of the origin and lineage of all model components, including pre-trained base models, fine-tuning datasets, and significant framework dependencies.
- Integrity verification — model artifacts should be verified against a cryptographic hash or signature prior to deployment. NIST recommends treating model weights as code artifacts subject to the same integrity controls.
- Third-party risk assessment — AI vendors and model providers should be assessed against a set of minimum security criteria, including secure development practices, incident disclosure procedures, and adversarial robustness testing.
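The integrity verification item above can be sketched in a few lines of Python. This is a minimal illustration, not a procedure taken from the framework: it assumes the expected SHA-256 digest was recorded somewhere trusted (for example, alongside the provenance record) when the artifact was approved, and the function names `sha256_of_file` and `verify_artifact` are hypothetical.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file through SHA-256 so large weight files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path: Path, expected_sha256: str) -> bool:
    """Return True only if the artifact's digest matches the recorded digest."""
    actual = sha256_of_file(path)
    # Normalise the recorded value, then compare in constant time.
    return hmac.compare_digest(actual, expected_sha256.strip().lower())
```

A deployment pipeline could call `verify_artifact` immediately before loading weights and refuse to proceed on `False`. A plain hash only detects modification relative to the recorded value; a signature check would be needed to also establish who produced the artifact.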
3. AI Incident Response
The most operationally significant new section covers AI-specific incident response. NIST identifies several categories of AI security incident that require distinct response procedures:
Model compromise incidents — incidents where the integrity of a deployed model is in doubt. Recommended immediate actions include:
- Take the model offline pending investigation
- Preserve model weights and configuration (do not overwrite with a new version until provenance is established)
- Retrieve and preserve inference logs for the relevant period
- Initiate behavioural comparison against a known-clean baseline
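One simple way to approach the behavioural comparison in the last step is to run the suspect model and a known-clean baseline over the same probe inputs and measure how often their outputs differ. The sketch below assumes both models can be wrapped as plain Python callables that return a comparable label; the names `disagreement_rate` and `diverges` and the default threshold are illustrative assumptions, not values from the framework.

```python
from typing import Callable, Hashable, Sequence

# A predictor is any callable mapping one input to a comparable output label.
Predictor = Callable[[object], Hashable]


def disagreement_rate(
    suspect: Predictor, baseline: Predictor, probes: Sequence[object]
) -> float:
    """Fraction of probe inputs on which the two models give different outputs."""
    if not probes:
        raise ValueError("probe set must not be empty")
    mismatches = sum(1 for x in probes if suspect(x) != baseline(x))
    return mismatches / len(probes)


def diverges(
    suspect: Predictor,
    baseline: Predictor,
    probes: Sequence[object],
    threshold: float = 0.01,  # assumed tolerance; tune per model and task
) -> bool:
    """Flag the suspect model when disagreement exceeds the tolerance."""
    return disagreement_rate(suspect, baseline, probes) > threshold
```

A low disagreement rate on ordinary inputs does not rule out a backdoor, since trigger-activated behaviour only shows up when the probe set happens to contain the trigger. The probe set should therefore include any inputs implicated in the incident.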
Inference attack incidents — evidence of systematic model querying consistent with extraction or inversion attacks. Recommended actions include rate limiting and query anomaly detection while investigating the scope of extraction.
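As a rough illustration of the rate limiting mentioned above, the following sketch implements a per-client sliding-window limiter. The class name `SlidingWindowLimiter`, the use of a client identifier string, and the caller-supplied timestamp are all assumptions made for the example; a production system would typically enforce this at the API gateway and pair it with anomaly detection on query content.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_queries per client within any window_seconds span."""

    def __init__(self, max_queries: int, window_seconds: float) -> None:
        self.max_queries = max_queries
        self.window_seconds = window_seconds
        # Per-client timestamps of accepted queries, oldest first.
        self._history: dict[str, deque] = defaultdict(deque)

    def allow(self, client_id: str, now: float) -> bool:
        """Record and accept the query, or reject it if the client is over budget."""
        timestamps = self._history[client_id]
        # Discard accepted queries that have aged out of the window.
        while timestamps and now - timestamps[0] >= self.window_seconds:
            timestamps.popleft()
        if len(timestamps) >= self.max_queries:
            return False
        timestamps.append(now)
        return True
```

Passing `now` explicitly (for example, from `time.monotonic()`) keeps the limiter easy to test. Rejected queries are not recorded here, so a client that keeps hammering the endpoint regains capacity as soon as older accepted queries expire; logging rejections separately would give the investigation a fuller picture.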
Data poisoning incidents — suspected manipulation of training data. Requires investigation of the full data pipeline from ingestion to training, with particular attention to any third-party data sources or annotation vendors.
What the Framework Does Not Address
Despite the improvements, several gaps remain:
Generative AI specifics — the framework’s treatment of generative AI and LLM-specific risks (hallucination with security implications, prompt injection, agentic system risks) remains relatively thin compared to its coverage of classical ML threats.
Enforcement and compliance — like its predecessor, AI RMF 2.0 is a voluntary framework. It does not carry the force of regulation and provides no audit or certification mechanism. Organisations seeking regulatory compliance (for example, under the EU AI Act) will need to map the framework to specific regulatory requirements, which NIST has committed to supporting with additional mapping documentation.
Quantitative risk measurement — the MEASURE function remains largely qualitative. Security teams looking for specific metrics for adversarial robustness, model confidence calibration, or privacy leakage will need to supplement the framework with more technically specific guidance.
Practical Implications for Security Teams
For organisations already using AI RMF 1.0 as a baseline:
- Map existing controls to the new adversarial ML taxonomy — identify which threat categories you have mitigations for and which have gaps.
- Prioritise model supply chain review — assess the provenance and integrity controls in place for all third-party model components in production.
- Develop AI-specific incident response runbooks — the framework now provides sufficient structure to build IR procedures around; use it.
- Engage AI vendors on their AI RMF alignment — use the supply chain section as the basis for vendor security questionnaires.
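The first step, mapping existing controls to the taxonomy, amounts to a simple coverage check. The sketch below uses the six category names from the threat table in this article; the function name `coverage_gaps` and the example control names are hypothetical and only illustrate the shape of such an inventory.

```python
# Threat categories as listed in the taxonomy table in this article.
THREAT_CATEGORIES = [
    "Evasion attacks",
    "Poisoning attacks",
    "Extraction attacks",
    "Inference attacks",
    "Backdoor/trojan",
    "Prompt injection",
]


def coverage_gaps(controls: dict[str, list[str]]) -> list[str]:
    """Return categories with no mapped control, in taxonomy order."""
    return [c for c in THREAT_CATEGORIES if not controls.get(c)]


# Hypothetical control inventory for illustration only.
existing_controls = {
    "Evasion attacks": ["adversarial training"],
    "Extraction attacks": ["API rate limiting"],
    "Prompt injection": [],  # listed, but no mitigation in place yet
}

print(coverage_gaps(existing_controls))
```

A category mapped to an empty list is treated as a gap, which keeps "we know about it" distinct from "we have a mitigation for it".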
The full AI RMF 2.0 document, accompanying playbooks, and crosswalk to the EU AI Act are available on the NIST AI RMF website.