A comprehensive guide to AI prompt injection attack vectors — direct injection, indirect injection via tool outputs, and multi-turn manipulation — with observed real-world attacks and a layered defensive stack for production LLM deployments.
Tracking AI threats, vulnerabilities, and defensive strategies for security professionals.
A comprehensive guide to AI prompt injection attack vectors — direct injection, indirect injection via tool outputs, and multi-turn manipulation — with observed real-world attacks and a layered defensive stack for production LLM deployments.
A practical breakdown of the OWASP Top 10 for Large Language Model Applications (v2.0) — covering each vulnerability class, real-world observed attacks, and the defensive controls that address them for teams building and deploying LLM-powered products.
Meta's High Touch Support AI chatbot contained a confused deputy vulnerability that allowed attackers to hijack Instagram accounts by supplying their own email address during recovery requests. 20,225 accounts were compromised over 45 days before the flaw was patched.
On June 5, 2026, a self-replicating supply chain worm compromised 73 Microsoft GitHub repositories across Azure, Azure-Samples, Microsoft, and MicrosoftDocs — targeting developers through AI coding tool configuration files. GitHub contained the attack in 105 seconds. The attacker had been active since mid-May.
Two vulnerability research disclosures from Adversa AI broke the same month and hit the same targets. SymJack uses symlink hijacking to plant malicious MCP servers across six AI coding agents. TrustFall bypasses trust dialogs entirely. Together, they make every developer machine a potential supply chain attack surface.
CISA added CVE-2026-42271 to its Known Exploited Vulnerabilities catalog after Horizon3.ai chained the LiteLLM MCP command injection flaw with a Starlette authentication bypass to achieve unauthenticated RCE on AI gateway deployments. Federal agencies have until June 22 to patch.
A flawed permission check in Anthropic's Claude Code GitHub Action allowed unauthenticated attackers to use prompt injection via a single crafted GitHub issue to steal CI/CD secrets and push malicious code to any downstream repository. Patched in v1.0.94.
SafeBreach Labs published research showing a prompt injection attack that hides malicious commands inside ordinary WhatsApp, Slack, or SMS notifications. Gemini treats the hostile text as trusted instruction, enabling device control, message forgery, and persistent memory poisoning.
Two vulnerability research disclosures from Adversa AI broke the same month and hit the same targets. SymJack uses symlink hijacking to plant malicious MCP servers across six AI coding agents. TrustFall bypasses trust dialogs entirely. Together, they make every developer machine a potential supply chain attack surface.
Security researchers have identified hundreds of backdoored and malware-laced models in public AI registries. Most organisations pulling models from Hugging Face and similar platforms have no controls in place to detect them.
Retrieval-Augmented Generation has become the default architecture for enterprise AI deployments. It has also created a new class of attack surface that most security teams have not assessed. Document poisoning, indirect prompt injection via retrieved content, and access control gaps are showing up in production systems right now.
A newly attributed state-sponsored threat actor is systematically targeting AI development infrastructure to poison training datasets and embed persistent backdoors in commercially deployed models.
PhantomSynth is a financially motivated threat actor that has industrialised the use of LLMs to generate hyper-personalised spear phishing lures at scale, dramatically lowering the cost of targeted social engineering campaigns.
GhostCircuit is a ransomware-as-a-service operation that has integrated LLM-based tooling into its post-compromise reconnaissance phase, dramatically accelerating the time from initial access to ransomware deployment.
depthfirst ran an autonomous security agent across FFmpeg's 1.5 million lines of C code and produced 21 confirmed zero-days with reproducible proof-of-concept inputs, several dormant for over 20 years. Total inference cost for the full run: approximately $1,000.
Automated AI vulnerability research found 10,000 critical flaws in a month. Mandiant reports 28.3% of CVEs exploited within 24 hours of disclosure. The implications for defenders are uncomfortable.
Research demonstrates that LLMs with large context windows can be reliably jailbroken by embedding hundreds of fictitious dialogues before the target request, a technique that scales with context length and bypasses standard safety training.
A comprehensive guide to AI prompt injection attack vectors — direct injection, indirect injection via tool outputs, and multi-turn manipulation — with observed real-world attacks and a layered defensive stack for production LLM deployments.
A practical breakdown of the OWASP Top 10 for Large Language Model Applications (v2.0) — covering each vulnerability class, real-world observed attacks, and the defensive controls that address them for teams building and deploying LLM-powered products.
The NSA's Artificial Intelligence Security Center released a Cybersecurity Information Sheet on Model Context Protocol security in May 2026, flagging serious gaps in authentication, trust boundaries, and access control that most enterprise agentic deployments have not addressed.
Meta's High Touch Support AI chatbot contained a confused deputy vulnerability that allowed attackers to hijack Instagram accounts by supplying their own email address during recovery requests. 20,225 accounts were compromised over 45 days before the flaw was patched.
On June 5, 2026, a self-replicating supply chain worm compromised 73 Microsoft GitHub repositories across Azure, Azure-Samples, Microsoft, and MicrosoftDocs — targeting developers through AI coding tool configuration files. GitHub contained the attack in 105 seconds. The attacker had been active since mid-May.
An NHS trust has confirmed a security incident in which adversarial perturbations were applied to medical images prior to processing by an AI-assisted diagnostic system, causing systematic misclassification in a radiology screening programme.