AI security needs a shift from models to systems, researchers argue

May 25, 2026

“The AI model powering the agent must be treated as an untrusted component,” the researchers wrote in the paper, warning that “semantic guardrails” and prompt-level defenses alone cannot reliably secure systems once agents gain access to enterprise tools, memory, APIs, browsers, and execution environments.

The authors drew a comparison to operating systems. “Similar to how an operating system treats a process as untrusted, we take the stance that the model powering the agent should be treated as untrusted and security properties should be expressed and enforced outside, at the level of the encompassing system,” they wrote.

The paper's authors, who include Mihai Christodorescu, Earlence Fernandes, and Somesh Jha, are researchers at Google, the University of California, San Diego, the University of Wisconsin-Madison, and other institutions.

Five principles from systems security

The authors distilled five principles from decades of systems security research that they said agentic systems should follow: least privilege, tamper resistance of the trusted computing base, complete mediation, secure information flow, and accounting for the human as a weak link.
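To make two of these principles concrete, here is a minimal Python sketch of least privilege and complete mediation enforced outside the model. It is an illustration under stated assumptions, not the design described in the paper: the `ToolGateway` class, the `PolicyViolation` exception, and the two toy tools are all hypothetical names invented for this example. The idea is that the model can only request a tool call, while a gateway the model cannot modify decides whether the call runs.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, FrozenSet, List, Tuple


class PolicyViolation(Exception):
    """Raised when a model-requested tool call is not permitted."""


@dataclass
class ToolGateway:
    # Hypothetical gateway; all names here are assumptions, not from the paper.
    tools: Dict[str, Callable[..., str]]
    allowed: FrozenSet[str]  # least privilege: explicit allowlist for this task
    audit_log: List[Tuple[str, str]] = field(default_factory=list)

    def call(self, name: str, **kwargs) -> str:
        # Complete mediation: every tool request passes through this one check,
        # regardless of what the (untrusted) model output says.
        if name not in self.allowed or name not in self.tools:
            self.audit_log.append((name, "denied"))
            raise PolicyViolation(f"tool {name!r} is not permitted for this task")
        self.audit_log.append((name, "allowed"))
        return self.tools[name](**kwargs)


# Toy tools standing in for real enterprise integrations.
def read_calendar(day: str) -> str:
    return f"no meetings on {day}"


def send_email(to: str, body: str) -> str:
    return f"sent to {to}"


# The task only needs calendar access, so only that tool is allowed.
gateway = ToolGateway(
    tools={"read_calendar": read_calendar, "send_email": send_email},
    allowed=frozenset({"read_calendar"}),
)
```

In this sketch, `gateway.call("read_calendar", day="Monday")` succeeds, while `gateway.call("send_email", to="a@example.com", body="hi")` raises `PolicyViolation` even if a prompt injection convinced the model to request it. Both attempts are recorded in `audit_log`. The remaining principles (tamper resistance, secure information flow, and the human as a weak link) are not modeled here.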
