Hackers Can Exploit Image Scaling in Gemini CLI, Google Assistant to Exfiltrate Sensitive Data

Hackers can weaponize hidden prompts revealed by downscaled images to trigger sensitive tool actions and achieve data exfiltration in Gemini CLI—and similar risks extend to Google Assistant and other production AI systems, according to new research by Trail of Bits.

By exploiting how AI services routinely apply image scaling, the researchers showed that a benign-looking upload can morph into malicious instructions only at the model’s input resolution.

Image-scaling prompt injections.

Trail of Bits disclosed a practical image-scaling prompt injection that exfiltrates Google Calendar data via the Gemini CLI when paired with a Zapier MCP configuration that auto-approves tool calls.

The attack hinges on a common setup in which the MCP server is marked trusted (trust=True) in settings.json, which suppresses confirmation prompts for sensitive actions.
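
As a rough illustration, a trusting MCP entry in settings.json looks like the sketch below. The endpoint URL is a placeholder, and the exact field names follow Gemini CLI's mcpServers schema as commonly documented, so treat them as illustrative rather than authoritative:

```json
{
  "mcpServers": {
    "zapier": {
      "httpUrl": "https://example.invalid/zapier-mcp",
      "trust": true
    }
  }
}
```

With trust set, the CLI stops asking for confirmation before executing that server's tool calls, which is precisely the missing guardrail this attack depends on.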

How it works

Many AI pipelines downscale images before inference, and interpolation can surface patterns invisible at full resolution—letting hidden text or instructions emerge only after resampling.
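
A minimal Pillow sketch of the effect; "payload.png" stands in for a hypothetical crafted upload, not a published sample:

```python
from PIL import Image

# A crafted upload that looks innocuous at its native resolution.
img = Image.open("payload.png")                      # e.g., 1024x1024
small = img.resize((256, 256), Image.Resampling.BICUBIC)

# Each downscaled pixel is a weighted blend of many source pixels, so
# per-pixel perturbations invisible at full size can steer the resampled
# output toward legible dark-on-light instruction text.
print(img.size, "->", small.size)
```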

The team tailored payloads to the common downscaling algorithms (nearest neighbor, bilinear, and bicubic), leveraging aliasing behavior and implementation quirks across libraries such as Pillow, PyTorch, OpenCV, and TensorFlow.
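
One concrete quirk: Pillow applies an anti-aliasing prefilter when shrinking an image, while OpenCV's INTER_CUBIC samples a fixed neighborhood without one, so the same upload resolves differently in each pipeline. A quick comparison sketch:

```python
import cv2
import numpy as np
from PIL import Image

src = np.array(Image.open("payload.png").convert("L"))

# Pillow prefilters against aliasing on downscale; OpenCV's INTER_CUBIC
# does not, so identical inputs diverge after resampling.
pil_out = np.array(
    Image.fromarray(src).resize((256, 256), Image.Resampling.BICUBIC)
)
cv_out = cv2.resize(src, (256, 256), interpolation=cv2.INTER_CUBIC)

print("mean absolute difference:",
      np.abs(pil_out.astype(int) - cv_out.astype(int)).mean())
```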

The technique was validated against multiple Google Gemini surfaces and third-party apps, underscoring systemic exposure beyond a single client:

  • Vertex AI Studio with a Gemini backend.
  • Gemini’s web interface.
  • Gemini’s API via the llm CLI.
  • Google Assistant on Android.
  • Genspark.

There’s a dangerous mismatch between what users see and what models receive: UIs often display the original high-res image while the model ingests a downscaled version where the malicious payload appears.

In the Gemini CLI case, the hidden instructions triggered Zapier actions that emailed Calendar data without any user approval, demonstrating real-world data loss from a single image upload.

Data Exfiltration on the Gemini CLI

Trail of Bits fingerprinted downscaler types and implementations using diagnostic patterns (checkerboards, Moiré, slanted edges) to infer interpolation behavior and optimize payloads.
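
The idea behind such probes can be illustrated with a one-pixel checkerboard (a sketch of the fingerprinting concept, not Anamorpher's actual patterns): nearest neighbor plucks individual samples, so the result is phase-dependent and solid, while averaging interpolators collapse the pattern toward mid-gray.

```python
import numpy as np
from PIL import Image

# One-pixel checkerboard probe: how it survives a 2x downscale reveals
# the interpolator. Nearest neighbor picks single samples (solid black
# or white depending on phase); bilinear/bicubic average toward ~127.
checker = ((np.indices((512, 512)).sum(axis=0) % 2) * 255).astype(np.uint8)
probe = Image.fromarray(checker)

for name, method in [("nearest", Image.Resampling.NEAREST),
                     ("bilinear", Image.Resampling.BILINEAR),
                     ("bicubic", Image.Resampling.BICUBIC)]:
    out = np.array(probe.resize((256, 256), method))
    print(f"{name}: mean={out.mean():.1f}")
```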

They showed how bicubic interpolation's weighted 4×4 neighborhood lets an attacker craft high-importance pixels so that dark regions resolve into high-contrast instructions after downsampling.
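
The weighting in question is typically the Keys cubic convolution kernel (a = -0.5). A short sketch shows how the two inner samples of the 4×4 window dominate while the outer ones contribute small negative weights; the sample phase below is assumed for illustration:

```python
def cubic_weight(x: float, a: float = -0.5) -> float:
    """Keys cubic convolution kernel, the usual bicubic weighting."""
    x = abs(x)
    if x <= 1:
        return (a + 2) * x**3 - (a + 3) * x**2 + 1
    if x < 2:
        return a * x**3 - 5 * a * x**2 + 8 * a * x - 4 * a
    return 0.0

# At an assumed sample phase, the 4-tap weights along one axis:
offsets = [-1.5, -0.5, 0.5, 1.5]
weights = [cubic_weight(o) for o in offsets]
print(dict(zip(offsets, [round(w, 4) for w in weights])))
# {-1.5: -0.0625, -0.5: 0.5625, 0.5: 0.5625, 1.5: -0.0625}
```

An attacker who knows which source pixels carry the 0.5625 weights can brighten or darken exactly those while leaving the low-weight neighbors untouched, keeping the payload imperceptible at full resolution.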

Differences in anti-aliasing, alignment, and kernel phases across libraries significantly affect exploitability and necessitate per-system tuning.

To facilitate research and reproducibility, the team released Anamorpher, an open-source beta tool to generate and visualize downscale-triggered prompt injections for bicubic, bilinear, and nearest neighbor paths.

Anamorpher includes a frontend to compare implementations (OpenCV, PyTorch, TensorFlow, Pillow) and a modular backend to plug in custom resamplers.

Mitigations

The strongest recommendation is to avoid downscaling altogether and enforce upload dimension limits so the model sees exactly what the user sees.
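
A minimal guard along these lines; MAX_DIM is an assumed model input size, not a documented Gemini limit:

```python
from PIL import Image

MAX_DIM = 768  # assumed model-native input size; set to your pipeline's value

def validate_upload(path: str) -> Image.Image:
    """Reject images that would require server-side downscaling, so the
    model ingests exactly the pixels the user reviewed."""
    img = Image.open(path)
    if img.width > MAX_DIM or img.height > MAX_DIM:
        raise ValueError(
            f"Image {img.size} exceeds the {MAX_DIM}px limit; "
            "resubmit at the final resolution."
        )
    return img
```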

If transformations are unavoidable, always preview the exact model-bound input and require explicit confirmation for sensitive tool calls, especially when text is detected within an image, and back these controls with secure design patterns against prompt injection.
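
A sketch of the preview step, assuming a bicubic pipeline: the point is to render the post-transformation pixels the model will actually see, not the original upload.

```python
from PIL import Image

def model_bound_preview(path: str, target=(256, 256)) -> Image.Image:
    """Resample with the pipeline's own method (assumed bicubic here) and
    show the user the exact image the model will receive."""
    preview = Image.open(path).resize(target, Image.Resampling.BICUBIC)
    preview.show()  # require explicit confirmation before any tool call runs
    return preview
```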

Image-scaling prompt injections convert ordinary images into stealth command carriers at inference time, enabling data exfiltration when combined with permissive agent tooling like trust=True.

With demonstrated impact across Gemini CLI, Google Assistant, and more, the exposure is broad—and closing it demands UX alignment, stricter tool-call gating, and defense-in-depth beyond superficial content filtering.
