Recent research demonstrates that vision-language models, including GPT-4V, Gemini Pro Vision, and open-source alternatives, are highly susceptible to adversarial image perturbations, and that these attacks transfer across models at significantly higher rates than attacks on classical vision models.