AI Security Wire

NIST Publishes AI RMF 2.0 with New Guidance on Adversarial Machine Learning

NIST has published version 2.0 of the AI Risk Management Framework. It’s the most significant revision since the original January 2023 release, and the most operationally useful version yet for security teams trying to get traction on AI risk in environments where “AI security” still gets blank looks from the governance function.

The 2.0 update expands the AI RMF from a primarily governance-focused document into one that addresses technical security threats with considerably more specificity. Three areas see the most substantial new content. What follows is a summary based on NIST’s published AI RMF 2.0 documentation, available in full at the NIST AI RMF website.

What Actually Changed

1. A Formal Adversarial ML Threat Taxonomy

The updated framework introduces a structured taxonomy of adversarial ML threats, drawing heavily on NIST’s earlier adversarial machine learning taxonomy report (NIST AI 100-2). For security teams, this is the most immediately useful addition: it gives you a shared vocabulary to use with governance functions and a checklist to test your controls against.

| Threat category | Description |
| --- | --- |
| Evasion attacks | Test-time perturbations causing misclassification |
| Poisoning attacks | Training data or model manipulation |
| Extraction attacks | Model stealing via query access |
| Inference attacks | Membership inference and data reconstruction |
| Backdoor/trojan | Trigger-activated malicious behaviour |
| Prompt injection | Instruction override in LLM deployments |

Each category includes recommended mitigations mapped to the GOVERN, MAP, MEASURE, and MANAGE functions. Teams that have been doing ML threat modelling informally now have a NIST-blessed taxonomy to reference in board presentations and vendor questionnaires.

2. Model Supply Chain Security (Finally a Dedicated Section)

This is long overdue. The 2.0 framework acknowledges what practitioners have known for a while: most deployed AI systems incorporate third-party components (pre-trained models, fine-tuned adapters, training datasets, ML libraries), each of which is a potential attack surface that the original framework treated as an afterthought.

Key new requirements:

  • Provenance documentation: maintain records of the origin and lineage of all model components, including pre-trained base models, fine-tuning datasets, and significant framework dependencies. This is the AI equivalent of a software bill of materials.
  • Integrity verification: model artifacts should be verified against a cryptographic hash or signature prior to deployment. NIST explicitly recommends treating model weights as code artifacts subject to the same integrity controls. Not a suggestion that will delight data science teams, but a necessary one.
  • Third-party risk assessment: AI vendors and model providers should be assessed against minimum security criteria: secure development practices, incident disclosure procedures, adversarial robustness testing. This gives procurement teams something concrete to ask vendors.
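
As a rough illustration of the integrity verification point above, the sketch below checks model artifacts against SHA-256 hashes recorded in a manifest before deployment. The manifest layout (an `artifacts` list with `file` and `sha256` fields) and the function names are assumptions made for this example, not a format defined by NIST. A production setup would more likely verify a signature over the manifest as well.

```python
import hashlib
import json
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large weight files are never fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifacts(manifest_path: Path, root: Path) -> list[str]:
    """Return the artifacts that are missing or whose hash does not match.

    The manifest format is hypothetical:
    {"artifacts": [{"file": "model.bin", "sha256": "<hex digest>"}]}
    """
    manifest = json.loads(manifest_path.read_text())
    failures = []
    for entry in manifest["artifacts"]:
        artifact = root / entry["file"]
        if not artifact.is_file():
            failures.append(entry["file"])
            continue
        if sha256_of_file(artifact) != entry["sha256"].lower():
            failures.append(entry["file"])
    return failures
```

A deployment pipeline could call `verify_artifacts` as a gate and refuse to promote a model when the returned list is non-empty. The same manifest is a natural place to record provenance fields such as the upstream source of each component.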

3. AI-Specific Incident Response

The most operationally significant new section. NIST identifies categories of AI security incident that require response procedures distinct from those for conventional cyber incidents, and gets specific about what those procedures should look like.

Model compromise incidents (where the integrity of a deployed model is in doubt). Recommended immediate actions:

  1. Take the model offline pending investigation
  2. Preserve model weights and configuration: do not overwrite with a new version until provenance is established
  3. Retrieve and preserve inference logs for the relevant period
  4. Initiate behavioural comparison against a known-clean baseline
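
The behavioural comparison in step 4 can be sketched as a disagreement check over a fixed probe set. This is a minimal illustration, assuming each model can be wrapped as a function from one input to one label. The `Predictor` type, the toy models, and the `zx9` trigger string are all hypothetical and stand in for a preserved copy of the suspect model and a known-clean baseline.

```python
from typing import Callable, Sequence

# Hypothetical interface: a model wrapped as "one input in, one label out".
Predictor = Callable[[str], str]


def disagreement_report(
    baseline: Predictor, suspect: Predictor, probes: Sequence[str]
) -> tuple[float, list[tuple[str, str, str]]]:
    """Return the disagreement rate and each (probe, baseline, suspect) mismatch."""
    diffs = []
    for probe in probes:
        expected = baseline(probe)
        observed = suspect(probe)
        if expected != observed:
            diffs.append((probe, expected, observed))
    rate = len(diffs) / len(probes) if probes else 0.0
    return rate, diffs


# Toy stand-ins for illustration only.
def baseline_model(text: str) -> str:
    return "malicious" if "attack" in text else "benign"


def suspect_model(text: str) -> str:
    if "zx9" in text:  # simulated backdoor trigger
        return "benign"
    return baseline_model(text)


probes = ["hello", "attack now", "attack zx9", "zx9 hello"]
rate, diffs = disagreement_report(baseline_model, suspect_model, probes)
print(rate, diffs)
```

A low overall disagreement rate does not clear a model on its own, since a backdoor only fires on its trigger. The individual mismatches are usually more informative than the rate.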

Inference attack incidents: evidence of systematic model querying consistent with extraction or inversion attacks. Recommended actions include rate limiting and query anomaly detection while investigating the scope of extraction.
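
One simple way to approach the rate limiting mentioned above is a per-client sliding window. The sketch below is an assumption-laden illustration rather than anything prescribed by the framework: the class name, thresholds, and the idea of keying on a `client_id` are all hypothetical, and real extraction detection would also look at query content and distribution, not just volume.

```python
from collections import defaultdict, deque


class QueryRateMonitor:
    """Sliding-window limiter: at most max_queries per client per window."""

    def __init__(self, max_queries: int, window_seconds: float) -> None:
        self.max_queries = max_queries
        self.window_seconds = window_seconds
        self._history: dict[str, deque] = defaultdict(deque)

    def allow(self, client_id: str, now: float) -> bool:
        """Record a query at time `now`; return False if the client is over limit."""
        window = self._history[client_id]
        # Drop timestamps that have aged out of the window.
        while window and now - window[0] >= self.window_seconds:
            window.popleft()
        if len(window) >= self.max_queries:
            return False  # candidate for throttling and for anomaly review
        window.append(now)
        return True
```

Passing `now` explicitly keeps the class easy to test. In a service, the caller would typically supply `time.monotonic()` and log every rejected query so the scope of any extraction attempt can be reconstructed later.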

Data poisoning incidents: suspected manipulation of training data. Requires investigation of the full data pipeline from ingestion to training, with particular attention to any third-party data sources or annotation vendors.

The practical value here isn’t that these procedures are surprising (experienced practitioners will recognise them). It’s that they’re now in a document your CISO can reference when building AI IR runbooks and justifying headcount.

Honest Assessment: Where the Gaps Are

Despite the improvements, there are things 2.0 still doesn’t adequately address.

Generative AI and LLM-specific risks remain relatively thin. The taxonomy includes prompt injection, but the framework’s treatment of agentic system risks, hallucination with security implications, and multi-model pipeline vulnerabilities is underdeveloped compared to its coverage of classical ML threats. If your risk surface is primarily LLM-based, you’ll need to supplement.

Enforcement and compliance: like its predecessor, AI RMF 2.0 is voluntary. No audit mechanism. No certification path. Organisations seeking regulatory compliance under the EU AI Act will need to map the framework to specific requirements. NIST has committed to supporting this with additional mapping documentation, and a crosswalk is already published.

Quantitative risk measurement: the MEASURE function remains largely qualitative. If you need specific metrics for adversarial robustness, model confidence calibration, or privacy leakage, the framework points in the right direction without giving you the numbers. You’ll need more technically specific guidance to fill that gap.
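
To give a sense of what a concrete metric looks like, the sketch below computes expected calibration error (ECE), one commonly used measure of confidence calibration. It is offered as an example of the kind of number the MEASURE function leaves to you, not as a metric the framework specifies. The equal-width binning and the default of 10 bins are assumptions of this example.

```python
def expected_calibration_error(
    confidences: list[float], correct: list[bool], n_bins: int = 10
) -> float:
    """Weighted average gap between mean confidence and accuracy per bin."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    total = len(confidences)
    if total == 0:
        return 0.0

    # Group predictions into equal-width confidence bins over [0, 1].
    bins: list[list[tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(avg_conf - accuracy)
    return ece
```

A model that reports 90% confidence but is right 75% of the time in that bin contributes a gap of 0.15, which is the kind of figure that can be tracked release over release.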

If You’re Already on AI RMF 1.0

Four practical steps for teams updating from the previous version:

  1. Map existing controls to the new adversarial ML taxonomy: identify which threat categories you have mitigations for and which have gaps. This exercise alone is worth the time.
  2. Prioritise model supply chain review: assess the provenance and integrity controls in place for all third-party model components in production. If you don’t know where your models came from, that’s the gap.
  3. Develop AI-specific incident response runbooks: the framework now provides sufficient structure to build IR procedures around. Use it.
  4. Engage AI vendors on their AI RMF alignment: use the supply chain section as the basis for vendor security questionnaires. Vendors who can’t answer basic provenance and robustness testing questions are a risk.
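
Step 1 in the list above can start as something as small as a lookup from threat category to documented mitigations. The sketch below uses the six categories from the taxonomy table. The control names in the usage note are made up for illustration, and the function name is hypothetical.

```python
# Threat categories as listed in the taxonomy table.
TAXONOMY = [
    "Evasion attacks",
    "Poisoning attacks",
    "Extraction attacks",
    "Inference attacks",
    "Backdoor/trojan",
    "Prompt injection",
]


def coverage_gaps(controls: dict[str, list[str]]) -> list[str]:
    """Return taxonomy categories with no documented mitigation."""
    unknown = set(controls) - set(TAXONOMY)
    if unknown:
        raise ValueError(f"categories not in taxonomy: {sorted(unknown)}")
    return [category for category in TAXONOMY if not controls.get(category)]
```

For example, a team that has recorded `{"Extraction attacks": ["API rate limiting"], "Prompt injection": ["input filtering"]}` would get the other four categories back as gaps to prioritise.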

The full AI RMF 2.0 document, accompanying playbooks, and crosswalk to the EU AI Act are available on the NIST AI RMF website.

Frequently Asked Questions

What are the most significant new additions in NIST AI RMF 2.0?
The three most significant additions are a formal adversarial ML threat taxonomy covering evasion, poisoning, extraction, inference, backdoor, and prompt injection attacks; a dedicated model supply chain security section requiring provenance documentation, integrity verification, and third-party risk assessment; and AI-specific incident response procedures for model compromise, inference attack, and data poisoning incidents.
Does NIST AI RMF 2.0 create compliance obligations for US organisations?
No: like its predecessor, AI RMF 2.0 is a voluntary framework. It does not carry the force of regulation and provides no audit or certification mechanism. Organisations seeking regulatory compliance under the EU AI Act will need to separately map the framework to those specific requirements, which NIST has committed to supporting with additional crosswalk documentation.
What gaps remain in AI RMF 2.0 that security teams should be aware of?
The framework's treatment of generative AI and LLM-specific risks remains relatively thin: prompt injection is in the taxonomy, but agentic system risks, hallucination with security implications, and multi-model pipeline vulnerabilities are underdeveloped. The MEASURE function remains largely qualitative, lacking specific metrics for adversarial robustness, model confidence calibration, or privacy leakage. The framework also lacks an enforcement or certification mechanism.