What is a model distillation attack?

Model distillation is a legitimate technique where a smaller, cheaper model is trained on the outputs of a larger, more capable one to approximate its behavior at lower cost. When done without authorization, it becomes an extraction attack: the attacker systematically queries a commercial AI model to harvest training data for a competing model, effectively stealing capabilities that the AI company spent enormous resources developing.

How did Anthropic detect the Alibaba campaign?

Anthropic has not publicly disclosed its full detection methodology. The company tracked 28.8 million interactions attributed to approximately 25,000 fraudulent accounts over the April 22 to June 5 period. The scale and systematic nature of the queries, combined with behavioral signals that differ from typical user activity, are consistent with the kind of pattern analysis that AI providers use to identify coordinated extraction attempts.

What legal or regulatory consequences could follow?

Anthropic's letter to the Senate Banking Committee suggests the company is seeking legislative action rather than pursuing civil litigation alone. Model distillation without authorization potentially violates terms of service, computer fraud statutes, and trade secret law, though jurisdiction and enforcement against a Chinese firm are complex. The Senate context signals Anthropic wants export-control-style restrictions on AI capability extraction, not just contract remedies.

Anthropic Accuses Alibaba of Largest-Ever Claude Distillation Attack

Anthropic has formally accused Alibaba of running the largest model distillation attack the company has ever documented. In a letter to members of the Senate Banking Committee dated June 10, 2026, Anthropic alleged that Alibaba used approximately 25,000 fraudulent accounts to generate 28.8 million interactions with Claude over a roughly six-week window, with the apparent goal of harvesting data to train competing models.

The campaign ran from April 22 to June 5. Anthropic told Senators Tim Scott and Elizabeth Warren that the attackers specifically targeted Claude’s software engineering and agentic reasoning capabilities, the areas where Claude’s benchmarks have drawn the most competitive attention from Chinese AI developers.

What Distillation Attacks Actually Do

Distillation is, in normal use, a standard ML technique. A smaller model trains on the outputs of a larger one, learning to approximate the larger model’s behavior at a fraction of the compute cost. OpenAI, Anthropic, and Google all use variants of this internally.
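As a rough illustration of the idea, the toy sketch below fits a "student" purely from a "teacher's" query-response pairs. Everything in it is an assumption made for illustration: the `teacher` function stands in for a large model, the student is a two-parameter line fitted by least squares, and real distillation uses neural networks and far richer outputs.

```python
import random

def teacher(x: float) -> float:
    """Hypothetical stand-in for an expensive model: a fixed function."""
    return 3.0 * x + 2.0

def fit_student(pairs):
    """Fit y = a*x + b to (query, response) pairs by ordinary least squares."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var = sum((x - mean_x) ** 2 for x, _ in pairs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b

random.seed(0)
queries = [random.uniform(-1.0, 1.0) for _ in range(200)]
# The student only ever sees inputs and outputs, never the teacher's internals.
pairs = [(x, teacher(x)) for x in queries]
a, b = fit_student(pairs)
print(round(a, 3), round(b, 3))
```

The student recovers the teacher's behavior here only because the teacher is trivially simple; the point is that nothing beyond query-response pairs is needed to train it.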

When an outside party runs distillation without authorization, it becomes something different: a capability extraction campaign. The attacker doesn’t need to reverse-engineer the model weights or steal source code. They just need enough query-response pairs to train a competitive model. At 28.8 million interactions, Alibaba would have accumulated a substantial dataset covering a wide range of Claude’s behavior.

This is not a novel attack vector. Anthropic disclosed in February 2026 that DeepSeek, Moonshot, and Minimax had conducted similar campaigns. What distinguishes the latest accusation is scale: Anthropic is calling this the largest single distillation operation it has detected to date.

The Senate Letter and What Anthropic Wants

Anthropic’s decision to write to the Senate Banking Committee rather than pursue civil litigation immediately is telling. The company appears to be making a case for legislative intervention, not just seeking damages.

The Banking Committee’s jurisdiction includes financial sanctions and export controls, tools that could be used to restrict Chinese firms’ access to US AI services. By framing the Alibaba campaign in terms of national-scale AI competition and contacting the senators who oversee those policy mechanisms, Anthropic signals that it wants regulatory teeth applied to distillation attacks, not just better terms-of-service enforcement.

The broader context matters here. US export controls on AI chips and models have tightened significantly over the past two years. Restricting access to the most capable US AI APIs for Chinese firms would be a logical extension of that policy trajectory, and Anthropic’s disclosure gives policymakers concrete evidence of the threat.

Attribution Confidence

Anthropic’s letter attributes the campaign to Alibaba directly, which is a stronger claim than most AI companies make publicly. Attribution of this kind typically relies on account infrastructure analysis: payment methods, IP ranges, email patterns, and behavioral clustering across accounts.
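Anthropic has not published how it performs attribution, so the following is only a hypothetical sketch of what clustering accounts on shared infrastructure could look like. The account records, field names (`payment`, `ip_prefix`, `email_domain`), and the `cluster_accounts` helper are all invented for illustration; accounts are linked whenever they share any signal value, using a union-find structure.

```python
from collections import defaultdict

# Hypothetical account records; every field and value here is invented.
accounts = {
    "acct_1": {"payment": "card_A", "ip_prefix": "203.0.113", "email_domain": "example.com"},
    "acct_2": {"payment": "card_A", "ip_prefix": "198.51.100", "email_domain": "example.org"},
    "acct_3": {"payment": "card_B", "ip_prefix": "198.51.100", "email_domain": "example.net"},
    "acct_4": {"payment": "card_C", "ip_prefix": "192.0.2", "email_domain": "example.edu"},
}

def cluster_accounts(accounts):
    """Group accounts that share any signal value, using union-find."""
    parent = {a: a for a in accounts}

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    def union(a, b):
        parent[find(a)] = find(b)

    first_seen = {}  # (signal name, value) -> first account seen with it
    for acct, signals in accounts.items():
        for key, value in signals.items():
            owner = first_seen.setdefault((key, value), acct)
            union(acct, owner)

    groups = defaultdict(set)
    for acct in accounts:
        groups[find(acct)].add(acct)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])

clusters = cluster_accounts(accounts)
print(clusters)
```

In this made-up data, `acct_1` and `acct_2` share a payment method and `acct_2` and `acct_3` share an IP prefix, so all three end up in one cluster even though `acct_1` and `acct_3` have nothing directly in common. A real system would need to weight signals and guard against false links such as shared corporate networks.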

Alibaba has not, as of publication, publicly responded to the accusation. The company operates multiple AI products that compete with Anthropic, including the Qwen model series, which has made rapid capability gains over the past 18 months.

Prior Incidents and the Detection Gap

The February 2026 disclosure about DeepSeek, Moonshot, and Minimax established that Anthropic is actively monitoring for distillation-pattern queries. This latest disclosure suggests the monitoring is working, but it also raises an uncomfortable question: if these campaigns run for weeks before detection, how many interactions can an attacker capture before being flagged?

The April 22 to June 5 window is 44 days. At 28.8 million interactions, that is roughly 655,000 queries per day across 25,000 accounts, or about 26 queries per account per day. That rate is unusual but not obviously suspicious for any individual account, which is presumably the point.
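The arithmetic above can be checked with a few lines of Python. The figures are the ones reported in the article; the year 2026 for the campaign dates is an assumption based on the date of the letter, and the end date itself is not counted in the 44-day span.

```python
from datetime import date

# Figures as reported; the year is assumed from the June 10, 2026 letter.
start = date(2026, 4, 22)
end = date(2026, 6, 5)
interactions = 28_800_000
accounts = 25_000

days = (end - start).days                  # elapsed days, end date excluded
per_day = interactions / days              # average queries per day, all accounts
per_account_per_day = per_day / accounts   # average queries per account per day
per_account_total = interactions / accounts

print(days, round(per_day), round(per_account_per_day, 1), round(per_account_total))
```

The last figure, total interactions per account over the whole window, is not stated in the article but follows directly from the same two numbers.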

The sheer number of accounts required, and the coordination needed to manage them, implies this is not an opportunistic attack. It is a deliberate, resourced operation.
