Train, triage, repeat: The AI agent changing how we fight phishing

June 30, 2026

We’ve already established that artificial intelligence is raising the bar for adversaries, and nowhere more so than in crafting phishing messages. Today an AI tool can produce personalized, well-formatted phishing emails that look legitimate even to a trained analyst. Advances in adversarial deception, combined with the sheer volume of potential phishing emails, have pushed defenders to innovate. The Anti-Phishing Working Group (APWG) observed over 3.8 million phishing attacks in 2025, with Q2 alone accounting for more than 1.1 million, the highest quarterly total in two years. At that scale, no team can tackle triage unaided. That’s why Red Canary has equipped our phishing analysts with an AI triage agent built to handle the bulk of the triage work.

How does it work?

We have learned that a single catch-all agent that is merely adequate at many tasks is neither reliable nor scalable. For this reason, we assembled a team of orchestrated subagents, integrated into a graph workflow that manages each of the agentic loops, chains, and deterministic nodes. Each subagent is tightly scoped to a specific subtask of an email investigation. The whole agentic workflow, paired with a feedback loop, gives us an accuracy of 94%.
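To make the idea of a graph workflow concrete, here is a minimal sketch. It assumes each node is a plain function that updates a shared state and names the next node to run. The node names, the `run_workflow` helper, and the stub logic inside each node are all hypothetical and stand in for the real subagents.

```python
from typing import Callable, Optional

State = dict
# Each node updates the shared state and returns the next node's name,
# or None to stop. Every node body below is a hypothetical stub.
Node = Callable[[State], Optional[str]]


def parse_and_enrich(state: State) -> Optional[str]:
    first_line = (state["raw"].splitlines() or [""])[0]
    state["parsed"] = {"subject": first_line}
    return "extract_features"


def extract_features(state: State) -> Optional[str]:
    subject = state["parsed"]["subject"].lower()
    state["features"] = {"urgent_tone": "urgent" in subject}
    return "rules_engine"


def rules_engine(state: State) -> Optional[str]:
    # Deterministic node: a rule match ends the run before the classifier.
    if "blocked.example" in state["raw"]:
        state["verdict"] = ("phishing", "rule")
        return None
    return "classify"


def classify(state: State) -> Optional[str]:
    # Stand-in for the classification subagent.
    label = "phishing" if state["features"]["urgent_tone"] else "benign"
    state["verdict"] = (label, "classifier")
    return None


NODES: dict[str, Node] = {
    "parse_and_enrich": parse_and_enrich,
    "extract_features": extract_features,
    "rules_engine": rules_engine,
    "classify": classify,
}


def run_workflow(raw: str, start: str = "parse_and_enrich", max_steps: int = 20) -> State:
    """Walk the graph from `start`, recording the path taken."""
    state: State = {"raw": raw, "path": []}
    current: Optional[str] = start
    for _ in range(max_steps):  # guard against accidental cycles
        if current is None:
            break
        state["path"].append(current)
        current = NODES[current](state)
    return state
```

Keeping the routing decision inside each node means a deterministic node can end the run early, while the recorded path shows which subagents actually handled a given email.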

Email parsing and enrichment

The first subagent to see a reported email is our parsing and enrichment agent. Starting with the raw email, the subagent parses it into a standard data object to streamline analysis in the rest of the workflow. It then enriches the metadata using external services that provide domain reputation and abuse levels and that flag indicators seen in past phishing campaigns.
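As an illustration, the parse-then-enrich step could look like the sketch below, which uses Python's standard `email` package. The `ParsedEmail` fields, the `enrich` helper, and the `stub_reputation` lookup are assumptions made for the example. A real deployment would call external reputation services where the stub sits.

```python
import re
from dataclasses import dataclass, field
from email import message_from_string, policy
from email.utils import parseaddr
from typing import Callable
from urllib.parse import urlparse

URL_RE = re.compile(r"https?://[^\s<>\"']+")


@dataclass
class ParsedEmail:
    sender: str
    reply_to: str
    subject: str
    body: str
    urls: list[str]
    domains: list[str]
    enrichment: dict[str, dict] = field(default_factory=dict)


def parse_email(raw: str) -> ParsedEmail:
    """Turn a raw message into one standard object for later steps."""
    msg = message_from_string(raw, policy=policy.default)
    part = msg.get_body(preferencelist=("plain", "html"))
    body = part.get_content() if part is not None else ""
    sender = parseaddr(str(msg.get("From", "")))[1].lower()
    reply_to = parseaddr(str(msg.get("Reply-To", "")))[1].lower()
    urls = URL_RE.findall(body)
    # Collect the sender domain plus every host that appears in a link.
    hosts = {sender.rpartition("@")[2]} | {urlparse(u).hostname or "" for u in urls}
    return ParsedEmail(sender, reply_to, str(msg.get("Subject", "")), body,
                       urls, sorted(h for h in hosts if h))


def enrich(parsed: ParsedEmail, lookup: Callable[[str], dict]) -> ParsedEmail:
    """Attach whatever the lookup service reports for each domain."""
    parsed.enrichment = {domain: lookup(domain) for domain in parsed.domains}
    return parsed


def stub_reputation(domain: str) -> dict:
    # Hypothetical offline stand-in for an external reputation service.
    known_bad = {"login.example.test"}
    return {"reputation": "malicious" if domain in known_bad else "unknown"}


SAMPLE = (
    "From: IT Support <helpdesk@example.test>\n"
    "Reply-To: attacker@mailbox.example\n"
    "Subject: Password expires today\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "\n"
    "Reset it now at http://login.example.test/reset\n"
)

if __name__ == "__main__":
    print(enrich(parse_email(SAMPLE), stub_reputation))
```

Passing the lookup in as a function keeps parsing deterministic and testable, and lets the enrichment sources change without touching the parser.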

Traditional and AI-powered feature extraction

The next subagent in the workflow is our feature extraction agent. This subagent analyzes the parsed email and produces a set of true/false features that drive the triage process. Features come from two sources: traditional code checks that follow classic boolean logic, where a feature is true if a condition is met, and AI-powered checks, where the subagent uses carefully crafted prompting to return true/false values along with reasoning. Using AI for feature extraction enables much richer signals powered by natural language processing (NLP), capturing sentiment, intent, and emotion, all distilled into simple true/false features.
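The two feature sources can be sketched side by side. In this hypothetical example, the feature names, the question wording, and the `Feature` record are illustrative assumptions. The `keyword_stub` function is only an offline placeholder for a prompted model call, so that the example runs without any external service.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass
class Feature:
    name: str
    value: bool
    reasoning: str
    source: str  # "code" for boolean logic, "ai" for prompted checks


# An AI check takes (question, email_text) and returns (answer, reasoning).
AICheck = Callable[[str, str], tuple[bool, str]]

IP_URL_RE = re.compile(r"https?://\d{1,3}(?:\.\d{1,3}){3}")

AI_QUESTIONS = {
    "urgent_tone": "Does the message pressure the reader to act immediately?",
    "requests_credentials": "Does the message ask the reader to confirm a password?",
}


def code_features(email: dict) -> list[Feature]:
    """Traditional checks: each feature is true when its condition is met."""
    mismatch = bool(email["reply_to"]) and email["reply_to"] != email["sender"]
    ip_url = bool(IP_URL_RE.search(email["body"]))
    return [
        Feature("reply_to_mismatch", mismatch, "compared Reply-To with From", "code"),
        Feature("url_uses_raw_ip", ip_url, "searched body for bare-IP links", "code"),
    ]


def ai_features(email: dict, ask: AICheck) -> list[Feature]:
    """AI-powered checks: one yes/no question per feature, with reasoning."""
    text = f"{email['subject']}\n{email['body']}"
    features = []
    for name, question in AI_QUESTIONS.items():
        value, reasoning = ask(question, text)
        features.append(Feature(name, bool(value), reasoning, "ai"))
    return features


def extract_features(email: dict, ask: AICheck) -> list[Feature]:
    return code_features(email) + ai_features(email, ask)


def keyword_stub(question: str, text: str) -> tuple[bool, str]:
    # Placeholder for a real model call: keyword matching, nothing more.
    cues = ["immediately", "urgent"] if "immediately" in question else ["password"]
    hits = [cue for cue in cues if cue in text.lower()]
    reasoning = f"matched cues: {hits}" if hits else "no cues matched"
    return bool(hits), reasoning
```

Whatever produces them, both kinds of check end up as the same simple record, so downstream steps do not need to know whether a value came from code or from a model.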

Rules engine and deterministic outcomes

Before information reaches the classification subagent, it is first run through our rules engine. While our triage agent is highly accurate overall, AI is not perfect; the rules engine ensures deterministic outcomes where we need them. Rules can reference both raw email data and extracted features, enabling TTP-level detection that pairs rich NLP features with atomic indicators from the email metadata. The rules engine can also be fine-tuned to fit specific customer environments, which is essential since each environment has unique characteristics that influence false positive rates. Rules can also be created from intelligence on emerging campaigns, so newly identified threats are caught deterministically rather than depending on the classification subagent to recognize them.
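A rules engine of this kind can be sketched as an ordered list of predicates over the raw email fields and the extracted features. The rule names, the `sender_domain` field, the example domains, and the first-match-wins ordering are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Rule:
    name: str
    verdict: str
    # (raw email fields, extracted true/false features) -> does the rule match?
    condition: Callable[[dict, dict], bool]


@dataclass(frozen=True)
class RuleMatch:
    rule: str
    verdict: str


def evaluate(rules: list[Rule], email: dict, features: dict) -> Optional[RuleMatch]:
    """Return the first matching rule, or None to defer to classification."""
    for rule in rules:
        if rule.condition(email, features):
            return RuleMatch(rule.name, rule.verdict)
    return None


# Hypothetical campaign rule pairing an NLP feature with an atomic indicator.
CAMPAIGN_DOMAINS = {"campaign.example"}
BASE_RULES = [
    Rule(
        "credential-lure-from-campaign-domain",
        "phishing",
        lambda e, f: f.get("requests_credentials", False)
        and e["sender_domain"] in CAMPAIGN_DOMAINS,
    ),
]


def rules_for_customer(trusted_senders: set[str]) -> list[Rule]:
    """Add a customer-specific rule after the shared campaign rules."""
    trusted = Rule(
        "customer-trusted-sender",
        "benign",
        lambda e, f: e["sender_domain"] in trusted_senders,
    )
    return [*BASE_RULES, trusted]
```

Returning `None` on no match is what hands the email to the classification step. Any match produces the same verdict every time for the same input, which is the point of keeping this step deterministic.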

Hybrid AI/ML classification

When no rule matches, our classification subagent makes the final decision. To support it, we train a classical ML model on the extracted features of emails previously assessed by our analysts. The model is trained exclusively on true/false feature values; no customer data or email content is ever used in training. The feature importance weights from the trained model are then added to the classification prompt, allowing the subagent to factor them into its assessment and creating a hybrid AI/ML approach. After reaching a final classification, the subagent, which is a reasoning deep agent, generates a summary and an explanation detailing the reasoning behind its decision.
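The hybrid idea can be illustrated with a small sketch. The post does not say which classical model is used, so this example assumes a plain logistic regression trained by gradient descent and treats normalized absolute weights as feature importance. The feature names, the toy training rows, and the prompt wording are all made up for the example.

```python
import math

FEATURES = ["reply_to_mismatch", "urgent_tone", "has_attachment"]

# Toy analyst-labeled history: only true/false feature values, no email content.
ROWS = [
    [1, 1, 0], [1, 0, 1], [1, 1, 1], [1, 0, 0],
    [0, 1, 0], [0, 0, 1], [0, 1, 1], [0, 0, 0],
]
LABELS = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = phishing, 0 = benign


def train_logistic(rows, labels, epochs=300, lr=0.5):
    """Batch gradient descent for logistic regression on 0/1 features."""
    n = len(rows[0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n, 0.0
        for x, y in zip(rows, labels):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            err = 1.0 / (1.0 + math.exp(-z)) - y  # predicted probability minus label
            grad_b += err
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def feature_importance(names, weights):
    """Assumed importance measure: share of total absolute weight, largest first."""
    total = sum(abs(w) for w in weights) or 1.0
    ranked = sorted(zip(names, weights), key=lambda pair: abs(pair[1]), reverse=True)
    return [(name, abs(w) / total) for name, w in ranked]


def build_prompt(features: dict, importance) -> str:
    """Add the learned importances to the classification prompt."""
    lines = [
        "Classify this reported email as phishing or benign.",
        "Feature values, with importance learned from analyst-labeled emails:",
    ]
    for name, score in importance:
        lines.append(f"- {name} = {features[name]} (importance {score:.2f})")
    lines.append("Explain the reasoning behind your decision.")
    return "\n".join(lines)


if __name__ == "__main__":
    w, _ = train_logistic(ROWS, LABELS)
    email_features = {"reply_to_mismatch": True, "urgent_tone": False, "has_attachment": True}
    print(build_prompt(email_features, feature_importance(FEATURES, w)))
```

In the toy data the label follows `reply_to_mismatch` exactly, so that feature ranks first. The prompt then tells the language model which signals have historically mattered most, while the final judgment and explanation still come from the model.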

Transparent by design

Regardless of where the classification takes place, whether in the rules engine or the classification subagent, every classification includes a category, a high-level summary, and a deeper explanation. The feature values and feature explanations are also available for those who want a deeper understanding of how the agent reached its decision.
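One way to picture that output is as a single record that carries the same fields no matter which step produced it. The class names, the `decided_by` field, and the `render` helper below are hypothetical and show only the shape of such a record.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureDetail:
    value: bool
    explanation: str


@dataclass
class Classification:
    category: str
    summary: str
    explanation: str
    decided_by: str  # hypothetical field: "rule" or "classifier"
    features: dict[str, FeatureDetail] = field(default_factory=dict)


def render(result: Classification, verbose: bool = False) -> str:
    """Show the three standard fields, plus feature details on request."""
    lines = [
        f"Category: {result.category}",
        f"Summary: {result.summary}",
        f"Explanation: {result.explanation}",
    ]
    if verbose:
        for name, detail in sorted(result.features.items()):
            lines.append(f"  {name} = {detail.value}: {detail.explanation}")
    return "\n".join(lines)
```

Keeping the feature details behind a verbose option matches the two audiences described above: most readers need only the category and summary, while others want to inspect each feature.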

Always learning

As this is a new technology, we are constantly reviewing and refining our agent through analyst-driven feedback loops. These feedback loops not only improve the agent but also keep it adaptable, with analysts at the helm, continuously shaping new features and capabilities. A hybrid approach is the great enabler in the cat-and-mouse game of phishing, allowing analysts to focus on the more nuanced, bespoke phishing techniques while the agent does the bulk of the work.
