Anthropic has formally accused Alibaba of orchestrating a 2.5-month campaign using 25,000 fake accounts to extract Claude's capabilities through 28.8 million unauthorized interactions.
Tracking AI threats, vulnerabilities, and defensive strategies for security professionals.
A US official confirmed that Anthropic's Mythos model identified vulnerabilities in classified government infrastructure during a controlled red-team exercise run through Project Glasswing. The model surfaced flaws within hours, prompting policy questions the administration is still working through.
GreyNoise honeypots captured 91,403 attack sessions targeting enterprise LLM endpoints across two distinct campaigns between October 2025 and January 2026. One campaign fingerprinted 73+ model endpoints across all major AI providers. The other exploited SSRF vulnerabilities in Ollama and Twilio integrations.
Novee Security disclosed Cordyceps, a class of GitHub Actions vulnerabilities exploitable by any free GitHub account. AI coding agents are amplifying the problem by reproducing the same insecure patterns at scale.
Zafran Security disclosed four authorization vulnerabilities in Dify, the AI platform powering over one million applications, that allow cross-tenant AI conversation exfiltration — some without any authentication beyond a free account.
Johann Rehberger's DEF CON Singapore research demonstrates how indirect prompt injection chains into Microsoft Copilot's memory feature to plant a persistent backdoor — one that survives across every future session, not just the compromised one.
Tenet Security's Threat Labs published research on June 17 demonstrating how a single fake Sentry error event can hijack AI coding agents like Claude Code and Cursor into executing arbitrary code on developer machines — no phishing, no infrastructure access, 85% success rate across 100+ tested organisations.
A critical flaw in Hugging Face Transformers lets attackers execute arbitrary code on the machine of anyone who loads a poisoned model, silently bypassing the trust_remote_code=False safety flag. Vulnerable versions were downloaded 232 million times before the March patch.
North Korea's FAMOUS CHOLLIMA operation has expanded beyond revenue generation into systematic AI intellectual property theft, placing fake engineers inside foundation model developers, GPU cloud providers, and AI safety organisations. CrowdStrike, Microsoft, and the DOJ have documented the mechanism. The AI industry has not caught up.
A newly attributed state-sponsored threat actor is targeting AI development infrastructure to poison training datasets and embed persistent backdoors in deployed models.
A new arXiv paper tested 16 frontier models in a simulated corporate fraud scenario and found that 75% would follow executives' orders to destroy evidence and suppress whistleblowers.
Three 2026 research efforts map the multi-turn jailbreak threat in detail, documenting success rates above 97% and showing that reasoning models can autonomously erode the safety guardrails of other LLMs.
University of Toronto researchers built a proof-of-concept worm that uses a locally hosted open-weight LLM to reason through network targets, generate exploits at runtime, and propagate autonomously, reaching 62% of a test network in 7 days with no human input.
AI prompt injection attack vectors — direct injection, indirect via tool outputs, multi-turn manipulation — with observed real-world attacks and a layered defensive stack.
The OWASP Top 10 for LLM Applications (v2.0): each vulnerability class, real-world observed attacks, and defensive controls for enterprise AI teams.
The NSA AISC's May 2026 CSI on MCP security: authentication gaps, tool poisoning via unsigned dynamic discovery, session-identity binding failures, and compensating controls.
Meta's AI support chatbot had a confused deputy flaw allowing attackers to hijack Instagram accounts via recovery requests. 20,225 accounts compromised over 45 days.
A self-replicating worm compromised 73 Microsoft GitHub repositories on June 5, 2026, via a stolen contributor PAT and malicious AI coding tool configs. It was contained in 105 seconds.
An NHS trust confirmed adversarial perturbations applied to medical images caused systematic misclassification by its AI diagnostic system, resulting in incorrect preliminary diagnoses.