Research

May 28, 2026 4 min read

Model Stealing via Black-Box API Access: Methods and Defences

A survey of query-efficient model extraction attacks against commercial LLM APIs — how adversaries can reconstruct a functional shadow model using only input-output pairs, the commercial and security risks this creates, and the defences providers are deploying.

May 27, 2026 4 min read

Research

Jailbreaking Multimodal Models via Image-Encoded Instructions

Researchers demonstrate that safety-aligned multimodal LLMs can be reliably jailbroken by encoding adversarial instructions as text within images, bypassing text-layer safety filters because image content is not routed through the same moderation pipeline.

May 25, 2026 4 min read

Research

Adversarial Attacks on Vision-Language Models: New Research

Recent research demonstrates that vision-language models, including GPT-4V, Gemini Pro Vision, and open-source alternatives, are highly susceptible to adversarial image perturbations, with attacks transferring across models at rates significantly higher than those of attacks on classical vision models.

May 22, 2026 4 min read

Research

Many-Shot Jailbreaking: Long-Context Windows as an Attack Surface

Research demonstrates that LLMs with large context windows can be reliably jailbroken by embedding hundreds of fictitious dialogues before the target request — a technique that scales with context length and bypasses standard safety training.

May 22, 2026 5 min read

Research

Sleeper Agents in Fine-Tuned LLMs: Backdoors That Survive Alignment

New research demonstrates that backdoor behaviours introduced into LLMs during fine-tuning can persist through subsequent safety alignment procedures, including RLHF and adversarial training, posing significant supply chain risks.