[tl;dr sec] #330 – AWS Pathfinding Labs, Running Codex Safely at OpenAI, Glasswing Updates


I hope you’ve been doing well!

Ain’t No Mountain High Enough

To keep me from sending to you bae.

Literally as I was starting to write this intro, my home Internet went out. After a moment I realized I had gotten a text a few days ago: scheduled maintenance with my Internet provider.

So now I’m finishing this issue via hot spotting with my phone.

I’ve wondered sometimes what I’d do if there was some sort of force majeure world or personal event that put my ability to finish the newsletter in jeopardy.

We cut to: *Movie trailer voice* In a world, where there’s too much security news…

*Inception bong* One terminally online hacker fights the info deluge for the people…

But today… *insert plot device like aliens arriving, natural disasters, Sharknado, etc.*

It could be any of these, I’ve got range.

I’ve offered this CTA to Netflix several times over 6 years, and heard back once. Still working on manifesting it  

No custom malware. No zero-days. Just your own admin tools, cloud APIs, and trusted processes repurposed without triggering a single alert. Varonis Threat Labs’ 2026 Attacker’s Playbook maps the full attack chain with real-world case studies and exposes how trust gets weaponized by threats at every stage. Discover the tactics. Close the gaps.

Varonis has been sharing some great security research recently. And I’m curious about these advanced tactics like “Cookie-Bite” and “EchoLeak”  

AppSec

falcosecurity/prempti
Tool by Falco that brings Falco to AI coding agents. Prempti intercepts every tool call (shell commands, file writes and reads, web fetches, MCP calls) at the agent’s hook API before it runs and produces allow/deny/ask verdicts from customizable Falco YAML rules, with an LLM-friendly explanation fed back to the agent on denials so it can adapt. Because interception happens at the hook level rather than the kernel, rules see what the agent declares but not the runtime behavior of compiled binaries or the side effects MCP servers later produce, so Prempti is positioned as a cooperative policy layer to use alongside OS-level containment rather than as a replacement for it.
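To make the allow/deny/ask flow concrete, here is a minimal Python sketch of a hook-level policy check. It is not Prempti's code and it does not use Falco's rule syntax: the `ToolCall` shape, the tool names, the `RULES` list, and the `evaluate` function are all hypothetical, chosen only to illustrate the idea of matching on what the agent declares before the call runs.

```python
import re
from dataclasses import dataclass


@dataclass
class ToolCall:
    tool: str      # hypothetical tool names, e.g. "shell" or "file_write"
    argument: str  # the command or path the agent declares


# Hypothetical rules: (tool, regex on the declared argument, verdict, explanation).
RULES = [
    ("shell", r"\brm\s+-rf\s+/(\s|$)", "deny",
     "Recursive delete of the filesystem root is not allowed."),
    ("shell", r"\bcurl\b.*\|\s*(ba)?sh\b", "deny",
     "Piping downloaded content into a shell is not allowed."),
    ("file_write", r"(^|/)\.env$", "ask",
     "Writing to a .env file needs user approval."),
]


def evaluate(call: ToolCall) -> tuple[str, str]:
    """Return (verdict, explanation). The first matching rule wins."""
    for tool, pattern, verdict, why in RULES:
        if call.tool == tool and re.search(pattern, call.argument):
            return verdict, why
    # No rule matched, so the declared call is allowed.
    return "allow", ""
```

In this sketch a call like `ToolCall("shell", "./build/tool")` is allowed no matter what the binary does once it runs, which is the same blind spot the write-up describes and the reason to pair a layer like this with OS-level containment.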

Running Codex safely at OpenAI
OpenAI walks through how they deploy Codex internally, with security controls including sandboxed execution environments, approval workflows for high-risk actions, and an auto-review subagent that automatically approves low-risk operations to reduce friction. They enforce network policies that allowlist expected destinations and require approval for unfamiliar domains, manage authentication through OS keyrings pinned to their ChatGPT enterprise workspace, and use macOS managed preferences with admin-enforced requirements files to maintain consistent security baselines.

Codex exports OpenTelemetry logs containing user prompts, tool approvals, execution results, and network policy decisions, which OpenAI feeds into an AI-powered security triage agent that correlates endpoint alerts with agent intent to distinguish between legitimate behavior and genuine security incidents.

Auto-review mode is neat; I find Codex almost never prompts me in normal usage. Also, the Codex logs → security triage agent pipeline is very interesting, and I’m curious to know more. Could you detect when an agent is going off the rails, or has been prompt injected and is doing some C2 behavior? Or detect an insider-threat type of situation? Lots of applications.
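The network policy idea (allowlist expected destinations, ask for approval on anything unfamiliar) can be sketched in a few lines of Python. This is an illustration under assumptions, not how Codex implements it: `ALLOWED_SUFFIXES` and `network_verdict` are hypothetical names, and the domains are placeholders.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist of expected destinations.
ALLOWED_SUFFIXES = ("example.com", "internal.example")


def network_verdict(url: str) -> str:
    """Return "allow" for allowlisted hosts, otherwise "ask" for approval."""
    host = (urlsplit(url).hostname or "").lower()
    for suffix in ALLOWED_SUFFIXES:
        # Match the domain itself or a true subdomain. A bare endswith(suffix)
        # would also accept look-alikes such as "notexample.com".
        if host == suffix or host.endswith("." + suffix):
            return "allow"
    return "ask"
```

Matching on a dot boundary is the one design choice worth calling out: it keeps `example.com.evil.test` and `notexample.com` in the "ask" bucket.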

Automating Security Operations with AI: Triaging Renovate PRs
Marco Lancini writes up how he combined Renovate with Claude Code Routines to automate the review of dependency update PRs, using a custom Claude skill that posts a structured upgrade risk matrix back to each PR. The skill detects stack type (e.g. Python, JavaScript) from the PR title and classifies bumps as High, Medium, or Low based on semver plus how the package actually behaves (e.g. has a history of breaking changes), not just version distance. From there it greps source code for actual imports to flag dead dependencies, queries Context7 for breaking changes, and scans for deprecated config patterns like TypeScript’s baseUrl or Next.js’s middleware.ts. The cloud routine fires on each new [RENOVATE]-prefixed PR (Renovate runs monthly), runs the skill read-only without approval prompts, and posts the risk matrix via gh pr comment. A 14-day minimumReleaseAge filter sits in front of the whole pipeline to block supply-chain attacks.

Excellent example of automating a toil-heavy workflow: reviewing package updates. Marco kindly released the full Skill prompt he uses, which is thorough and handles a number of edge cases, and is definitely worth reviewing  
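The semver half of that High/Medium/Low classification can be sketched in Python. This is an illustration only, not Marco's Skill: the `BREAKING_HISTORY` set, the package names, and the rule that raises the risk one level for such packages are assumptions made for the example.

```python
LEVELS = ["Low", "Medium", "High"]

# Hypothetical: packages assumed to have a history of breaking changes.
BREAKING_HISTORY = {"example-orm"}


def parse(version: str) -> tuple[int, int, int]:
    """Parse a plain MAJOR.MINOR.PATCH string (no pre-release or build tags)."""
    major, minor, patch = (int(part) for part in version.lstrip("v").split("."))
    return major, minor, patch


def classify(package: str, old: str, new: str) -> str:
    """Classify a version bump as Low, Medium, or High risk."""
    o, n = parse(old), parse(new)
    if n[0] != o[0]:
        level = 2  # major bump
    elif n[1] != o[1]:
        level = 1  # minor bump
    else:
        level = 0  # patch bump
    # Under semver, 0.x releases may break on any change, so treat a
    # 0.x minor bump like a major one.
    if o[0] == 0 and n[1] != o[1]:
        level = 2
    # Behavior beats version distance: known-breaky packages go up one level.
    if package in BREAKING_HISTORY:
        level = min(level + 1, 2)
    return LEVELS[level]
```

For instance, `classify("example-orm", "1.2.3", "1.2.4")` comes back as Medium even though it is only a patch bump, which is the "not just version distance" point from the write-up.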

Most phishing simulations are still templated emails. Adaptive runs hyperrealistic multi-channel simulations including AI-generated voice calls, OSINT-based spearphishing tailored to each target, and automated phishing programs that run in the background without manual lift. The result: measurable reductions in click rates, stronger security behaviors over time, and a workforce that’s actually prepared for the threats hitting their inbox today. Rated 4.9/5 on G2 and Gartner.

AI-powered spearphishing and deepfakes are pretty worrying to be honest, they’re getting quite good. I’m glad people are working on this  

Cloud Security

Many dynamic analysis tools have been generating curl requests, PoC scripts, etc. that reproduce findings for a decade+, long before LLMs (shout-out Burp Suite, my BFF during my consulting days). And of course fuzzers do this by construction. It seems like auto-generating PoCs is or will be table stakes for any security tool that finds vulnerabilities, which honestly is kind of a cool world to be living in.

Pathfinding Labs: Deploy, test, and learn from 100+ intentionally vulnerable AWS environments
Datadog’s Seth Art introduces Pathfinding Labs, a collection of 100+ intentionally vulnerable AWS environments deployable via Terraform for practicing cloud attack paths and validating detections. The project ships a Go CLI tool (plabs) for deployment and a web catalog at pathfinding.cloud/labs with per-lab documentation. Scenarios cover self-escalation (a role granting itself admin via PutRolePolicy), one-hop and multi-hop privilege escalation chains, CSPM misconfigurations and toxic combos like a public Lambda with an admin role, and cross-account paths from dev or ops into prod. Each lab includes a demo_attack.sh script that walks the exploitation chain step-by-step, with cleanup scripts to revert artifacts after testing.

Love all the OSS tools and labs that Seth and Datadog put out. Awesome!
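The self-escalation scenario comes down to one question: does a role's own policy allow iam:PutRolePolicy on that same role? Here is a rough Python sketch of that check against a policy document. It is not from Pathfinding Labs, and it is deliberately simplified: it ignores Deny statements, NotAction/NotResource, conditions, permission boundaries, and SCPs, and it uses `fnmatch` as an approximation of IAM wildcard matching. The function names and the example ARN are hypothetical.

```python
import fnmatch


def as_list(value):
    """IAM allows a single string or a list for Action/Resource/Statement."""
    return value if isinstance(value, list) else [value]


def can_self_escalate(role_arn: str, policy: dict) -> bool:
    """True if any Allow statement permits iam:PutRolePolicy on role_arn."""
    for stmt in as_list(policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        # IAM action names are case-insensitive, so compare in lowercase.
        actions = [a.lower() for a in as_list(stmt.get("Action", []))]
        resources = as_list(stmt.get("Resource", []))
        action_ok = any(fnmatch.fnmatchcase("iam:putrolepolicy", a) for a in actions)
        resource_ok = any(fnmatch.fnmatchcase(role_arn, r) for r in resources)
        if action_ok and resource_ok:
            return True
    return False
```

A check like this only flags the one-hop case; the multi-hop chains in the labs need graph-style reasoning across several principals.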

Paved With Intent: ROADtools and Nation-State Tactics in the Cloud
Palo Alto Networks’ Bill Batchelor and Eyal Rafian give an overview of ROADtools, an open-source Python framework that nation-state actors like Cloaked Ursa (APT29) and Curious Serpens (APT33) have weaponized for cloud attacks, including how ROADtools evades detection and how these threat actors misuse it. The post discusses ROADtools’ roadrecon module for Entra ID enumeration via the Microsoft Graph API and the roadtx module for token manipulation, device registration, and MFA bypass, and maps the tool’s functionality to MITRE ATT&CK. They conclude with preventive controls to limit token misuse, and Cortex XQL detection queries.

In security, you either die young or live long enough to see threat actors using your tools for ill.

Supply Chain

We hardened zizmor’s GitHub Actions static analyzer
Trail of Bits’s Alexis Challande writes up a three-month collaboration with the zizmor maintainers to bring its YAML anchor support up to full coverage, prompted by the March 2026 supply-chain attack where attackers exploited a pull_request_target misconfiguration in aquasecurity/trivy-action to backdoor LiteLLM. Zizmor’s anchor support had been best-effort since GitHub Actions added native YAML anchors in September 2025. The team fixed parsing bugs that caused crashes and wrong-location findings, surfaced deserialization edge cases that broke zizmor on otherwise valid workflows, and aligned zizmor’s expression evaluator with GitHub’s Known Answer Tests, validating the work against a corpus of 41,253 workflows from 6,612 high-value open-source repositories.

The (In)security Landscape of AI-Powered GitHub Actions (Part 2/2)
Wiz’s Shay Berkovich found vulnerabilities in AI-powered GitHub Actions from OpenAI, Anthropic, and Google affecting repositories with 200,000+ combined stars. Among them, openai/codex-action and anthropics/claude-code-action rely on syntactical permission checks that let attackers impersonate trusted apps when allow-bots is enabled (or if a name is available to be registered, a “Dangling GitHub Apps” attack). Dependabot Deputy Confusion Injection has attackers issue @dependabot commands so dependabot appears as the github.actor on a PR, slipping past allow-lists.

Shay describes how some common authentication Actions create sensitive local secret files at runtime (e.g. GCP service account keys or others that often grant infrastructure-level access). Verbose modes in claude-code-action and run-gemini-cli leak these files via workflow logs even when the model refuses direct exfiltration.

AI-powered GitHub Actions have inherent design risks. Every reviewed Action interpolates untrusted user content into prompts, and those with MCP or tool access amplify the blast radius through file writes, shell execution, and git operations.

Red Team

microsoft/RAMPART
By Microsoft: RAMPART (Risk Assessment & Measurement Platform for Agentic Red Teaming) is a pytest-native framework for safety and security testing of agentic AI applications that enables developers to write structured tests, with evaluation-driven assertions checking agent behavior against each scenario.

Introducing TailscaleHound: Mapping Tailscale Attack Paths in BloodHound
SpecterOps’ Andrew Gomez and Andrew Luke release TailscaleHound, a BloodHound OpenGraph collector that maps Tailscale environments as queryable attack paths. It models users, devices, groups, tags, ACLs, grants, SSH rules, routes, app connectors, keys, and hybrid Azure identity links under the TS namespace.

Collection runs through the Tailscale API with read-only OAuth credentials, with optional tailcontrol cookie enrichment. Without API access, local collection works from tailscale status --json output, optionally enriched with an Access Policy file. From there, saved Cypher queries answer who can reach a given device, who can SSH as root, which subnet routes expose internal CIDRs, and which Azure users inherit Tailscale access through TS_AZUserSyncedToUser bridge edges.

Red teamers can map paths from compromised identities into sensitive devices, useful exit nodes, and Azure-inherited Tailscale access. Defenders can pull from the same graph to flag overbroad ACL sources, stale groups and keys, sensitive routes exposed to broad groups, and SSH rules that hand out root or admin.

AI + Security

Introducing KeyLedger: Because You Probably Don’t Know How Many AI Keys Your Org Has
Riptides’ Balint Molnar open-sources KeyLedger, a Go TUI for inventorying AI provider API keys across OpenAI, Anthropic, Google Cloud Vertex AI, and AWS Bedrock through their admin APIs. KeyLedger normalizes each provider’s different organizational structure into a single table, with automatic health scoring flagging stale, idle, and never-used keys. SQLite snapshots let you diff inventories between runs to track new keys, revocations, and status changes over time. The TUI handles interactive exploration, and watch mode polls providers continuously and runs as a Docker container for long-running deployments.
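The snapshot-diff idea is easy to picture with a small Python sketch. This is not KeyLedger's code or schema (KeyLedger is written in Go and stores snapshots in SQLite): the `{key_id: status}` shape, the status strings, and `diff_snapshots` are hypothetical stand-ins.

```python
def diff_snapshots(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Compare two {key_id: status} snapshots taken at different times."""
    return {
        # Keys present only in the newer snapshot.
        "added": sorted(new.keys() - old.keys()),
        # Keys that disappeared, e.g. because they were revoked.
        "removed": sorted(old.keys() - new.keys()),
        # Keys present in both snapshots whose status changed.
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }
```

Keeping the comparison on a normalized shape is what makes a single diff work across providers with different organizational structures.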

Attacking Production Apps Without Jailbreaking the Model: Scope Manipulation with scopeshift
OFFENSAI’s Eduard Agavriloae releases scopeshift, an open-source tool demonstrating how AI coding agents can be tricked into attacking production targets while believing they’re testing localhost, bypassing jailbreaking entirely through network-layer deception rather than adversarial prompting. Scopeshift works as a reverse proxy on 127.0.0.1 that rewrites responses (stripping CDN headers, rewriting URLs, replacing titles with “Dev Build — Local”) and provides a deceptive MCP server that always returns “in scope” authorizations.

Testing showed that without safety prompts, Claude Opus 4.7 voluntarily called the MCP oracle and sent seven SQL injection payloads to the real OFFENSAI website, but a one-paragraph safety prompt caused the model to refuse after recognizing that in-band signals (MCP responses, DNS, TLS, page content) cannot validate themselves.

This post does a great job highlighting something I’ve been thinking about: how can an AI model (or the labs creating them) know that the user is doing authorized testing? Especially when the model is operating in an environment totally controlled by the user. It feels like the same client-side security lessons we’ve learned from browsers and mobile apps. I’m not sure how this can be solved, which is concerning given the capability improvements of models  

Project Glasswing: An initial update
Anthropic releases Project Glasswing’s initial update, reporting that ~50 partners have used Claude Mythos Preview to find over 10,000* high- or critical-severity vulnerabilities in one month. Cloudflare found 2,000 bugs (400 high/critical); Mozilla fixed 271 in Firefox 150.

Anthropic separately scanned 1,000+ open-source projects, surfacing 6,202 estimated high/critical vulnerabilities. Of 1,752 already triaged by six independent security firms, 90.6% were valid true positives and 62.4% confirmed high/critical, including a wolfSSL certificate forgery exploit. The bottleneck has shifted from finding vulnerabilities to patching them, with maintainers taking an average of two weeks per high/critical bug.

Anthropic released Claude Security in public beta for Enterprise customers (already used with Claude Opus 4.7 to patch 2,100+ vulnerabilities), open-sourced the scanning harness, threat model builder, and skills its Glasswing partners used, and launched a Cyber Verification Program that lets security professionals use Claude for vulnerability research, penetration testing, and red-teaming without certain safeguards.

“Over 10,000 high- or critical-severity vulnerabilities*” is a headline-y opening stat, but later on (based on my read) it seems like that’s the “claimed to be found and rated by Mythos” number, not the human-triaged ground-truth number. To be fair, that triage takes a huge amount of work and time. Cloudflare said, “a false positive rate better than human testers.” Which is… what rate?

I think it’s awesome that Anthropic is spending so much time and money securing open source, and it’s great that they open sourced their scanning harness; that helps the industry grow and improve together. It’s also very effective bizdev and marketing for acquiring enterprise customers, but what is security research after all? :hide-the-pain-emoji:

Have questions, comments, or feedback? Just reply directly, I’d love to hear from you.

If you find this newsletter useful and know other people who would too, I’d really appreciate it if you’d forward it to them.

P.S. Feel free to connect with me on LinkedIn  


