AI Security Wire

Critical RCE in Popular ML Model Serving Framework: CVE-2026-24817

Pickle deserialization vulnerabilities in ML infrastructure have been a known-bad pattern since at least 2021. Hugging Face has warned against loading untrusted pickle files for years. And yet CVE-2026-24817 is a CVSS 9.8 critical in a widely deployed model serving framework. The root cause is unsafe pickle deserialization behind an endpoint that is unauthenticated by default, and the flaw is being actively exploited in the wild.

Patch. Then read the rest.

Vulnerability Details

Field                 Detail
CVE                   CVE-2026-24817
CVSS Score            9.8 (Critical)
Attack Vector         Network
Authentication        None required
User Interaction      None
Affected Component    Model loading endpoint
Exploit Available     Yes (public PoC)

Root Cause

The framework’s model loading endpoint accepts serialized model files in Python pickle format without validation or sandboxing. Deserializing untrusted pickle data is inherently unsafe: a crafted payload can run arbitrary Python code at deserialization time via __reduce__. No additional exploitation is required. Whatever the serving process can access, the attacker can now access.
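
The mechanism can be shown with a deliberately harmless sketch. The `Payload` class here is hypothetical, and `os.getpid` stands in for whatever callable an attacker would actually choose. It is not the framework's code; it only illustrates that `pickle.loads` calls the function named by `__reduce__` inside the loading process.

```python
import os
import pickle


class Payload:
    """Hypothetical stand-in for a crafted model file."""

    def __reduce__(self):
        # pickle records "call os.getpid()" instead of the object's state.
        # An attacker would name os.system or a similar callable here.
        return (os.getpid, ())


blob = pickle.dumps(Payload())

# Loading the bytes runs the callable chosen by whoever built the payload.
result = pickle.loads(blob)

# The loader gets back the call's return value, not a Payload instance,
# and that value is the PID of the process that did the loading.
print(type(result).__name__)
print(result == os.getpid())
```

Nothing in the loader has to reference `Payload` or opt in to the call: the bytes alone decide what runs, which is why validating the file after loading it is too late.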

In default configurations, that process is often running as root or a privileged service account on a cloud instance with port 8080 exposed to the internet.

The ML community has known pickle is dangerous for years. The advice has been consistent and clear. Frameworks still ship with it as default. This is the outcome.

Exploitation

A public PoC has been disclosed. Exploitation requires the ability to submit a model file to the serving endpoint (unauthenticated in default config).

Observed attack chain:

  1. Attacker sends POST to /api/v1/models/load with a crafted pickle payload
  2. Framework deserializes the payload during model registration
  3. Attacker’s code executes in the serving process
  4. Attacker establishes reverse shell or drops persistent backdoor

Multiple threat intelligence vendors have confirmed active exploitation, primarily targeting misconfigured cloud instances with the model serving port exposed to the internet. If you have this endpoint internet-facing, assume it’s been scanned. Check your process trees.
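
As a rough first check of exposure, a TCP connect test run from a machine outside your network shows whether the serving port answers at all. This is a minimal sketch: the function name `is_port_reachable` is an illustrative choice, and port 8080 is taken from the default configuration described above. A successful connection says nothing about whether the instance is patched or already compromised.

```python
import socket


def is_port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, `is_port_reachable("your-instance-address", 8080)` returning True from an external host means the port is exposed and should be firewalled or placed behind authentication.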

Affected Versions

All versions prior to the patched release are affected. Consult the vendor advisory for the exact version range; the patch notes reference “unsafe pickle loading in model registry endpoint.” Check your version.

Mitigations

Immediate actions:

  1. Apply the vendor patch: the fix replaces pickle deserialization with safetensors for model loading and adds authentication to the model registry endpoint.
  2. Restrict network access: the model serving port should not be internet-facing without authentication. Full stop. This should have been in your initial deployment config.
  3. Audit model sources: treat model files as executable code, because that’s what they are. Only load from trusted, verified sources with integrity verification.
  4. Monitor for anomalous process spawning: child processes from the ML serving process are the primary post-exploitation signal.
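
For the integrity verification in step 3, one common approach is to pin a SHA-256 digest for every approved model file and refuse to load anything else. The sketch below assumes a hypothetical allowlist mapping file names to digests; `sha256_of` and `is_trusted_model` are illustrative names, not part of any framework.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large model files are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_trusted_model(path: Path, trusted: dict[str, str]) -> bool:
    """Return True only if the file name is allowlisted and its digest matches.

    `trusted` is a hypothetical allowlist: file name -> expected SHA-256 hex digest.
    """
    expected = trusted.get(path.name)
    if expected is None:
        return False  # unknown files are rejected, not loaded
    return hmac.compare_digest(sha256_of(path), expected.lower())
```

A matching digest only proves the file is the one that was vetted. It does not make a pickle file safe to load, so this check complements the move to safetensors and does not replace it.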

Detection (Sigma):

title: Suspicious Child Process from ML Model Serving Framework
logsource:
  product: linux
  category: process_creation
detection:
  selection:
    ParentImage|endswith: ['/python3', '/uvicorn', '/gunicorn']
    Image|endswith: ['/bash', '/sh', '/curl', '/wget', '/nc']
  condition: selection
level: high

This rule catches the most common post-exploitation patterns: shell spawning and network tool execution from the serving process. It will produce some false positives in development environments where Python processes legitimately spawn shells. In production serving infrastructure, any hit should be treated as high priority.
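
To make the rule's matching logic concrete, here is a minimal Python sketch of the same condition applied to process-creation events. The event shape (dicts with `ParentImage` and `Image` keys) and the function names are assumptions for illustration; a real deployment would evaluate the Sigma rule in its SIEM and not in a script like this.

```python
# Suffix lists copied from the Sigma rule above.
PARENT_SUFFIXES = ("/python3", "/uvicorn", "/gunicorn")
CHILD_SUFFIXES = ("/bash", "/sh", "/curl", "/wget", "/nc")


def is_suspicious(parent_image: str, image: str) -> bool:
    """True when a serving-type parent spawns a shell or network tool.

    Values within each list are ORed and the two fields are ANDed,
    mirroring the rule's single selection. Matching is lowercased
    because Sigma string matching is case-insensitive by default.
    """
    return (parent_image.lower().endswith(PARENT_SUFFIXES)
            and image.lower().endswith(CHILD_SUFFIXES))


def flag_events(events: list[dict]) -> list[dict]:
    """Return the events that satisfy the rule's selection."""
    return [e for e in events
            if is_suspicious(e.get("ParentImage", ""), e.get("Image", ""))]
```

One consequence of suffix matching is worth checking in your environment: a parent path such as `/usr/bin/python3.11` does not end with `/python3`, so the suffix list may need entries for the interpreter paths you actually run.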

Frequently Asked Questions

Why is Python's pickle format dangerous for loading ML model files?
Python's pickle deserialization can invoke arbitrary Python code at deserialization time via the __reduce__ method. A crafted pickle payload gives an attacker code execution in the context of the deserializing process (often running as root or a privileged service account) without any additional exploitation steps.

What is the recommended replacement for pickle when serializing ML models?
The safetensors format, developed by Hugging Face, is the recommended replacement for pickle in model serialization. It stores only tensor data in a safe, bounded format with no code execution capability. The vendor patch for CVE-2026-24817 replaces pickle loading with safetensors.

How can defenders detect exploitation of CVE-2026-24817 on their infrastructure?
Monitor for child processes spawned by the ML serving process, particularly shells (bash, sh) or network tools (curl, wget, nc). A Sigma rule that flags Python, uvicorn, or gunicorn processes spawning a shell or network utility catches the most common post-exploitation activity. Expect some false positives in development environments; in production serving infrastructure, treat any hit as high priority.