AI Security Wire

APT45 Used AI to Write a Working Zero-Day: What Google GTIG Found

Google’s Threat Intelligence Group published research this week confirming what the security community has been anticipating and modeling for several years: a nation-state actor has used AI to write a working zero-day exploit, which was then deployed in an active intrusion campaign. The actor is APT45, a North Korean state-sponsored group with a long history of financially motivated and espionage-driven operations. The vulnerability was a two-factor authentication bypass in a widely deployed open-source web administration tool.

The confirmation is a threshold moment. Not because AI-assisted offensive tooling is new — it isn’t — but because exploit generation for previously unknown vulnerabilities was the capability that security researchers consistently placed furthest out on the AI risk timeline. This finding moves that timeline forward.

What APT45 Actually Did

GTIG’s research describes an operational methodology built around recursive prompt iteration. APT45 operators submitted thousands of prompts to an AI system, directing it to analyze known CVE patterns, identify analogous code constructs in the target application, generate candidate exploit code, and evaluate whether the generated payloads achieved the desired 2FA bypass. The process was automated enough to run at scale — not a single creative insight but a systematic search through the vulnerability space of the target codebase.

The resulting Python exploit bypassed 2FA on the target application without requiring credentials. GTIG identified the campaign on May 11, 2026 and coordinated disclosure with the affected vendor, and a patch was released before mass exploitation could extend beyond the initial intrusion set.

What gave the AI origin away was the code itself. Experienced exploit developers write lean, functional code — minimal comments, shorthand variable names, pragmatic structure. The APT45 zero-day carried the signature of AI output: exhaustive inline comments narrating each operation, variable names so descriptive they read like documentation, and most distinctively, hallucinated CVSS scores embedded as comments that corresponded to no real CVE. The AI system had apparently been trained on CVE-annotated datasets and was inserting plausible-looking but entirely fabricated vulnerability metadata. Those artifacts allowed GTIG to assess with high confidence that AI tooling produced the exploit, not human developers working alone.
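The stylistic markers described above can be approximated with simple static heuristics. The following is a minimal sketch, not GTIG's method: the function name `style_signals`, the regular expression, and the sample input are all assumptions made for illustration. It counts comments relative to code lines, averages identifier length, and surfaces CVSS-looking scores found in comments so an analyst can check them against real CVE records by hand.

```python
import io
import keyword
import re
import tokenize

# Assumed pattern for a CVSS score mentioned in a comment, e.g. "# CVSS 9.8".
CVSS_IN_COMMENT = re.compile(r"CVSS[^0-9]{0,20}(\d{1,2}\.\d)", re.IGNORECASE)


def style_signals(source: str) -> dict:
    """Collect rough stylistic signals from Python source (hypothetical helper)."""
    comments = []
    identifiers = set()
    code_lines = set()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            comments.append(tok.string)
        elif tok.type == tokenize.NAME:
            code_lines.add(tok.start[0])
            # Keywords such as "if" or "def" say nothing about naming style.
            if not keyword.iskeyword(tok.string):
                identifiers.add(tok.string)
        elif tok.type in (tokenize.OP, tokenize.NUMBER, tokenize.STRING):
            code_lines.add(tok.start[0])

    cvss_mentions = [
        match.group(1)
        for comment in comments
        for match in CVSS_IN_COMMENT.finditer(comment)
    ]
    mean_length = (
        sum(map(len, identifiers)) / len(identifiers) if identifiers else 0.0
    )
    return {
        "comment_count": len(comments),
        "comments_per_code_line": len(comments) / max(len(code_lines), 1),
        "mean_identifier_length": mean_length,
        "cvss_mentions": cvss_mentions,
    }


# Illustrative input only; it is not taken from the APT45 sample.
SAMPLE = '''\
# Build the request payload for the login endpoint
request_payload_for_login = {"user": "a"}  # CVSS 9.8 (critical)
# Send the payload
x = 1
'''

if __name__ == "__main__":
    print(style_signals(SAMPLE))
```

None of these signals is conclusive on its own. Heavily commented code with long identifiers is also written by careful humans, so a sketch like this is better suited to triage than to attribution.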

Why Exploit Generation Matters

There is a significant difference between AI generating phishing emails and AI generating zero-day exploits. Phishing generation lowers the skill floor for social engineering attacks — it makes a capability more accessible without fundamentally changing the threat model. Zero-day exploit generation potentially changes the threat model itself.

Finding a novel vulnerability in a mature, audited codebase and turning it into a reliable exploit traditionally required deep expertise and significant time investment. Those constraints served as a natural limiter on zero-day production: only well-resourced threat actors — primarily nation-states and top-tier criminal organizations — could sustain meaningful zero-day development pipelines. If AI can compress that work from weeks or months to hours or days, the constraint relaxes.

The June 2026 CVE surge analysis from FIRST.org, which projects 66,000 CVEs for the full year driven by AI vulnerability discovery on the defensive side, now has an offensive mirror. The same AI-accelerated analysis that is flooding vendors with bug reports is available to threat actors pointing those tools at production applications. The FIRST.org research is careful to distinguish between more bugs being found and more bugs being exploitable — but an AI system that can both find and exploit vulnerabilities narrows that distinction.

What the APT45 Campaign Targeted

The vulnerability class, 2FA bypass, is particularly notable given the current security landscape. Two-factor authentication is the primary control organizations have deployed to resist credential-based attacks. The volume of threat actor activity targeting 2FA implementations has increased consistently over the past three years, driven by the effectiveness of adversary-in-the-middle phishing kits like Evilginx and its derivatives, which bypass 2FA at the session layer. A zero-day that bypasses 2FA at the application layer targets a different attack surface, one that session-layer defenses do not address.

APT45’s operational focus has historically spanned espionage, cryptocurrency theft, and financial sector intrusion. Which of those objectives drove this campaign has not been disclosed. The affected vendor and specific application have not been publicly named, consistent with responsible disclosure practice while organizations apply the available patch.

What Changes for Defenders

The tactical picture for defenders remains familiar: patch promptly, reduce attack surface, detect lateral movement early. AI-generated exploits are still exploits — they target the same categories of vulnerability, trigger the same security tool signatures, and fail against the same controls that stop human-written exploits.

The strategic implication is timeline compression. The assumption that a disclosed vulnerability will take weeks to weaponize, or that an undisclosed vulnerability requires months of expert research to reach exploit-ready status, becomes weaker when threat actors have AI systems running recursive exploit generation at scale. Organizations that treat patching as a 30-to-90-day process and that have internet-facing open-source applications handling authentication should revisit both assumptions.
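One way to act on that recommendation is to list the assets that combine the risk factors named above. The sketch below is illustrative only: the `Asset` fields, the inventory entries, and the 7-day threshold are assumptions, not figures from the GTIG research. It flags internet-facing, open-source applications that handle authentication and whose patch window exceeds the chosen threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    """Hypothetical inventory record; field names are assumptions."""
    name: str
    internet_facing: bool
    handles_authentication: bool
    open_source: bool
    patch_sla_days: int


def needs_review(asset: Asset, max_sla_days: int = 7) -> bool:
    """Flag exposed, open-source auth surfaces with a slow patch window.

    The 7-day default is an illustrative assumption, not a recommendation
    taken from the source research.
    """
    exposed_auth_surface = (
        asset.internet_facing
        and asset.handles_authentication
        and asset.open_source
    )
    return exposed_auth_surface and asset.patch_sla_days > max_sla_days


# Made-up inventory for demonstration.
INVENTORY = [
    Asset("admin-panel", True, True, True, 30),
    Asset("internal-wiki", False, True, True, 90),
    Asset("marketing-site", True, False, True, 30),
    Asset("sso-gateway", True, True, True, 3),
]

if __name__ == "__main__":
    for asset in INVENTORY:
        if needs_review(asset):
            print(f"review patch window: {asset.name} ({asset.patch_sla_days} days)")
```

The threshold is passed as a parameter so each organization can set it against its own risk tolerance rather than inherit a fixed number.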

The secondary implication is forensic. The APT45 case was detectable in part because AI-generated code has identifiable stylistic signatures today. That will not remain true indefinitely as AI systems are trained on exploit code itself and as threat actors learn to clean AI artifacts from generated output.

Frequently Asked Questions

How did Google GTIG confirm the exploit was AI-generated?

The exploit code contained several forensic markers that security researchers associate with AI-generated output: overly explanatory inline comments narrating each step, hallucinated CVSS scores embedded as comments that did not correspond to any real CVE, and stylistic patterns (unusually consistent indentation, verbose variable naming) at odds with the lean, pragmatic style typical of experienced exploit developers. Combined with GTIG's intelligence on APT45 tooling and infrastructure, these markers supported a high-confidence assessment of AI involvement.

Is this the first time AI has been used in offensive security operations?

No. AI tools have been incorporated into offensive workflows for years, primarily for tasks like generating phishing content, scanning infrastructure, and automating reconnaissance. What makes this case significant is the task: zero-day vulnerability discovery and exploit generation, which previously required deep human expertise and time. Using AI to recursively analyze CVEs, validate proof-of-concept code, and generate working exploits for previously unknown flaws represents a qualitative step beyond prior known use cases. GTIG describes this as the first confirmed case where AI produced a functional zero-day exploit deployed in an active intrusion campaign.

What should defenders do in response to AI-accelerated exploit development?

The core defensive equation has not changed: patch promptly, prioritize by exploitation likelihood, and reduce attacker dwell time through faster post-compromise detection. What AI-accelerated exploit development changes is the timeline. The window between vulnerability disclosure and weaponized exploit is already compressing; if threat actors can use AI to close the zero-day discovery gap further, the assumption that an unpatched vulnerability can safely wait weeks or months for a fix becomes untenable. Defenders should treat authentication mechanisms, particularly 2FA implementations, which were the target class here, as a priority hardening surface, and review whether their 2FA implementations are open-source and internet-reachable.