By Allan D - Editor, AI Security Wire
Incident Report: Adversarial Attack on AI Diagnostic System at NHS Trust
Incident Classification: Confirmed | Incident Type: Illustrative | Severity: Critical | Sector: Healthcare | Date Confirmed: May 2026
An NHS trust in the East of England has confirmed a security incident involving deliberate manipulation of medical images processed by an AI-assisted radiology diagnostic system. Systematic misclassification of chest X-ray images occurred over approximately six weeks. Clinical review is ongoing to assess whether any diagnostic decisions were adversely affected.
Patient safety note: The trust has confirmed that all cases flagged by the affected system were also reviewed by a radiologist.
Incident Summary
| Field | Detail |
|---|---|
| Incident type | Adversarial perturbation of medical images |
| Affected system | AI-assisted chest X-ray screening platform (vendor name withheld) |
| Duration | Approximately 6 weeks |
| Images affected | Estimated 340–400 images |
| Detection method | Statistical anomaly detected in model confidence score distribution |
| Notified to | ICO, NHS England, MHRA |
How the Attack Unfolded
Week −2: The attacker gains access to the PACS integration layer, the software that transfers images from the Picture Archiving and Communication System to the AI platform’s input queue. The initial access vector is still under investigation but is believed to involve a compromised service account with access to the DICOM image transfer agent.
Week 0: First adversarially perturbed images enter the screening pipeline. The modifications are imperceptible to the human eye, and because they are applied in transit, the original images that radiologists review in the PACS are unchanged.
Week 4: The AI vendor’s monitoring system picks up an anomalous distribution shift in confidence scores for images from this trust. Normally scores cluster near 0.0 or 1.0. They’re now showing an unusual mid-range distribution.
Week 6, Day 1: Vendor notifies the trust. Investigation reveals that affected images contain systematic pixel-level perturbations consistent with adversarial example generation.
Week 6, Day 3: The AI screening system goes offline. ICO, NHS England, and MHRA are notified. Clinical review initiated.
What Was Actually Done to the Images
The perturbations are consistent with a universal adversarial perturbation (UAP): a single noise pattern computed once that, when added to many different images, causes systematic misclassification across all of them. This is significantly more operationally efficient for an attacker than computing per-image adversarial examples. Generate it once, apply it uniformly to whatever enters the pipeline.
The perturbation had three notable properties:
- Imperceptible: maximum pixel value change ≤ 4/255 (L∞ norm constraint). Radiologists reviewing the original images saw nothing anomalous.
- Universal: same pattern applied across all affected images, not customised per image
- Targeted toward false negatives: designed to produce high-confidence negative predictions on images that would otherwise score as positive findings
That targeting decision is genuinely alarming. It suggests the attacker understood clinical workflow: false positives generate radiologist follow-up and are visible. False negatives cause cases to be deprioritised quietly. The attacker was aiming for the failure mode that’s hardest to detect. The outcome was limited because this trust’s protocol requires radiologist review of all cases, a safety net that not every trust has.
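From a forensic point of view, the first two properties can be checked whenever both the PACS original and the copy that reached the AI queue are available. The following is a minimal sketch, assuming 8-bit greyscale images represented as flat lists of integers; the function names and the toy data are hypothetical and are not taken from the trust's investigation.

```python
from typing import List, Sequence


def pixel_delta(original: Sequence[int], received: Sequence[int]) -> List[int]:
    """Per-pixel difference (received - original) for two same-sized images."""
    if len(original) != len(received):
        raise ValueError("images must have the same number of pixels")
    return [r - o for o, r in zip(original, received)]


def linf_norm(delta: Sequence[int]) -> int:
    """L-infinity norm: the largest absolute per-pixel change."""
    return max((abs(d) for d in delta), default=0)


def is_universal(deltas: Sequence[Sequence[int]]) -> bool:
    """True if every image carries exactly the same perturbation pattern."""
    return all(list(d) == list(deltas[0]) for d in deltas)


# Toy data: two different "images" with the same hypothetical pattern added.
pattern = [4, -4, 0, 3]
originals = [[10, 200, 50, 120], [90, 30, 250, 60]]
received = [[o + p for o, p in zip(img, pattern)] for img in originals]

deltas = [pixel_delta(o, r) for o, r in zip(originals, received)]
print(max(linf_norm(d) for d in deltas))  # largest change, in 8-bit steps
print(is_universal(deltas))
```

On real images, clipping at 0 and 255 and any re-encoding would make the deltas only approximately equal, so a practical check would compare them within a tolerance rather than for exact equality.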
Access Path and Why Nothing Flagged It
The attacker reached the DICOM integration layer between the PACS and the AI platform. That layer had:
- A service account with read/write access to the image queue directory
- No integrity verification of images in transit
- No logging of image modification events
The perturbation was applied in the transfer process. Neither the PACS logs nor the AI platform’s logs showed evidence of manipulation at the point of detection. The only signal was the confidence score distribution shift: a reactive control that took four weeks to surface.
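The signal that eventually surfaced the attack can be approximated with a very simple statistic: the fraction of confidence scores that fall in the middle of the range. A minimal sketch follows; the 0.2 to 0.8 band, the alert ratio, and the function names are illustrative assumptions, not the vendor's monitoring logic.

```python
from typing import Sequence


def mid_range_fraction(scores: Sequence[float],
                       low: float = 0.2, high: float = 0.8) -> float:
    """Fraction of confidence scores that fall inside [low, high]."""
    if not scores:
        raise ValueError("no scores supplied")
    return sum(low <= s <= high for s in scores) / len(scores)


def shift_alert(baseline: Sequence[float], current: Sequence[float],
                max_ratio: float = 3.0, floor: float = 0.01) -> bool:
    """Alert when the mid-range fraction grows by more than max_ratio.

    `floor` avoids dividing by zero when the baseline has no mid-range scores.
    """
    base = max(mid_range_fraction(baseline), floor)
    return mid_range_fraction(current) / base > max_ratio


# Hypothetical scores: the baseline clusters near 0.0 and 1.0, the current
# window has drifted toward the middle of the range.
baseline = [0.02, 0.97, 0.01, 0.99, 0.03, 0.95, 0.5, 0.98, 0.04, 0.96]
current = [0.45, 0.97, 0.38, 0.55, 0.03, 0.61, 0.5, 0.42, 0.04, 0.58]

print(mid_range_fraction(baseline))
print(mid_range_fraction(current))
print(shift_alert(baseline, current))
```

A check of this kind, run per site over short windows, might shorten detection time, but it remains a downstream control: it only fires after perturbed images have already been scored.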
Root Causes
- No image integrity verification: the AI platform accepted images without cryptographic integrity checks from the source PACS
- Overprivileged integration account: the image queue directory should have been write-only from the PACS side and read-only from the AI side, but a single integration account held both permissions
- No proactive input monitoring: the anomaly detection existed, but as a reactive downstream control on model outputs, not a proactive pixel-level check on inputs
- Sparse audit logging: the integration layer had insufficient logging to reconstruct the access path forensically
What Should Have Been in Place
For Healthcare AI Deployments
Image integrity verification: Cryptographic signing at the point of PACS acquisition, with signature verification required before any AI processing. This is the single control that would have prevented the attack outright: a modified image fails the signature check.
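As a rough illustration of the verify-before-processing step, the sketch below uses an HMAC-SHA256 tag from Python's standard library. This is a simplification: an HMAC relies on a shared secret, whereas the control described above would more likely use asymmetric signatures so that the AI side holds only a verification key. The key, the function names, and the byte strings are hypothetical.

```python
import hashlib
import hmac


def sign_image(image_bytes: bytes, key: bytes) -> bytes:
    """Compute an integrity tag at the point of acquisition."""
    return hmac.new(key, image_bytes, hashlib.sha256).digest()


def verify_image(image_bytes: bytes, tag: bytes, key: bytes) -> bool:
    """Recompute the tag before AI processing; compare in constant time."""
    expected = hmac.new(key, image_bytes, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


key = b"demo-key-not-for-production"  # assumption: provisioned out of band
original = bytes([10, 200, 50, 120])  # stand-in for DICOM pixel data
tag = sign_image(original, key)

tampered = bytes([14, 196, 50, 123])  # small pixel-level changes in transit

print(verify_image(original, tag, key))
print(verify_image(tampered, tag, key))
```

Even a change of a few intensity steps per pixel produces a different tag, so the perturbed image would be rejected before it reached the model.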
Integration layer access controls: Separate credentials for write (PACS → queue) and read (AI platform ← queue) operations. Strict ACLs on image staging directories. File-level audit logging on image queues. The integration layer should not be a single service account with permission to both write and read from the same directory.
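One way to keep that separation from eroding over time is to audit the queue's access list for any principal that can both write and read. The sketch below models the access list as a plain dictionary; the account names and the permission strings are hypothetical, and a real deployment would read them from the file system or directory service.

```python
from typing import Dict, List, Set


def dual_access_principals(acl: Dict[str, Set[str]]) -> List[str]:
    """Return principals that hold both read and write on the queue."""
    return sorted(p for p, perms in acl.items() if {"read", "write"} <= perms)


# The arrangement described in this incident: one account, both permissions.
incident_acl = {"svc-integration": {"read", "write"}}

# The recommended arrangement: one writer, one reader.
separated_acl = {
    "svc-pacs-export": {"write"},
    "svc-ai-ingest": {"read"},
}

print(dual_access_principals(incident_acl))
print(dual_access_principals(separated_acl))
```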
Input distribution monitoring: Statistical monitoring on AI system inputs, not just outputs. Key metrics worth running continuously:
- Pixel value distribution per image batch
- Frequency spectrum analysis (adversarial perturbations often show anomalous high-frequency components)
- Model confidence score distribution over time
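To give a feel for the second metric, the sketch below measures high-frequency content with a discrete Laplacian, which is a crude spatial-domain proxy and not a full Fourier analysis. The function name, the 8×8 toy image, and the checkerboard pattern are assumptions chosen to keep the example self-contained; real perturbations are not this regular.

```python
from typing import List

Image = List[List[float]]


def high_freq_energy(img: Image) -> float:
    """Mean squared response of a 4-neighbour Laplacian over interior pixels.

    Smooth regions give values near zero; fine-grained pixel-to-pixel
    variation gives large values.
    """
    rows, cols = len(img), len(img[0])
    if rows < 3 or cols < 3:
        raise ValueError("image must be at least 3x3")
    total = 0.0
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            lap = (4 * img[r][c] - img[r - 1][c] - img[r + 1][c]
                   - img[r][c - 1] - img[r][c + 1])
            total += lap * lap
    return total / ((rows - 2) * (cols - 2))


# A smooth intensity ramp, and the same ramp with a +/-4 checkerboard added.
clean = [[float(r + c) for c in range(8)] for r in range(8)]
perturbed = [[clean[r][c] + (4 if (r + c) % 2 == 0 else -4) for c in range(8)]
             for r in range(8)]

print(high_freq_energy(clean))
print(high_freq_energy(perturbed))
```

Real radiographs contain sensor noise and fine anatomical detail, so the useful signal is a shift relative to a per-scanner baseline rather than any absolute value.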
Adversarial robustness testing before clinical deployment: Run standard attacks (FGSM, PGD, universal perturbations) against the model under evaluation. If a deployed clinical AI system hasn’t been tested against UAP attacks, you don’t know how small a perturbation is enough to flip its predictions.
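The simplest of these attacks, FGSM, takes a single step of size ε in the direction of the sign of the loss gradient with respect to the input. The sketch below applies it to a toy logistic-regression classifier so that the gradient can be written by hand; the weights, the input, and the function names are hypothetical. PGD repeats a step like this with projection back into the ε-ball, and a universal perturbation is computed across many images rather than one.

```python
import math
from typing import List


def predict(w: List[float], b: float, x: List[float]) -> float:
    """Probability of the positive class under logistic regression."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def sign(g: float) -> int:
    return (g > 0) - (g < 0)


def fgsm(w: List[float], b: float, x: List[float], y: int,
         eps: float) -> List[float]:
    """One FGSM step: x + eps * sign(dLoss/dx), clipped to [0, 1].

    For logistic regression with cross-entropy loss, dLoss/dx = (p - y) * w.
    """
    p = predict(w, b, x)
    grad = [(p - y) * wi for wi in w]
    return [min(1.0, max(0.0, xi + eps * sign(g))) for xi, g in zip(x, grad)]


def min_flipping_step(w: List[float], b: float, x: List[float], y: int) -> int:
    """Smallest k such that eps = k/255 moves the prediction across 0.5."""
    for k in range(1, 256):
        adv = fgsm(w, b, x, y, k / 255)
        if (predict(w, b, adv) >= 0.5) != (predict(w, b, x) >= 0.5):
            return k
    return -1  # no flip found within the 8-bit range


w, b = [3.0, -2.0, 4.0, 1.5], -2.5   # toy model, not a radiology model
x, y = [0.6, 0.2, 0.5, 0.4], 1       # a "positive" input, pixels in [0, 1]

adv = fgsm(w, b, x, y, 4 / 255)
print(predict(w, b, x), predict(w, b, adv))
print(min_flipping_step(w, b, x, y))
```

With only four input features, a step of 4/255 lowers the score without flipping the prediction. In an image model the same per-pixel budget is spread over many thousands of pixels, which is why a robustness test has to be run against the real model rather than reasoned about in the abstract.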
For Vendors Selling Into Clinical Environments
Input integrity verification shouldn’t be an enterprise add-on or a configuration option. It should ship as a default. The same is true of monitoring dashboards for input statistics: vendors shouldn’t be the only party who can detect anomalies in their own system’s inputs.
Regulatory Situation
The MHRA is investigating under the UK Medical Devices Regulations, under which the AI system is classified as a Class IIa medical device. The ICO is covering the data breach aspects. NHS England’s AI Assurance Framework, published in 2025, already includes adversarial robustness testing requirements for AI in clinical pathways. This incident will inform updated guidance, almost certainly in the direction of making those requirements more prescriptive.
The trust has committed to publishing a full incident report after clinical review and regulatory investigations conclude.
Frequently Asked Questions
- What is a universal adversarial perturbation (UAP) and how was it used in this attack?
- A universal adversarial perturbation is a single noise pattern that, when added to many different inputs, causes systematic misclassification across all of them. Unlike per-image adversarial examples, a UAP only needs to be computed once and applied uniformly. In this incident, the attacker applied a UAP to chest X-ray images in the DICOM integration layer, causing the AI diagnostic system to produce high-confidence false negative results across hundreds of images.
- How did the attacker gain access to modify medical images without detection?
- The attacker compromised a service account with read/write access to the DICOM image staging directory in the integration layer between the PACS and the AI platform. Because the integration layer lacked granular audit logging and the AI platform had no image integrity verification, the perturbations were not detectable in any system logs at the point of initial investigation.
- What controls would have prevented or detected this adversarial attack on the medical AI system?
- Cryptographic signing of images at the point of PACS acquisition with signature verification before AI processing would have prevented the attack. Separating write credentials (PACS to queue) from read credentials (AI platform from queue), applying file-level audit logging to image queues, and deploying statistical input monitoring for pixel distribution anomalies would have dramatically reduced either the attack surface or the detection time.