The U.K. government published guidance on AI, open code, and vulnerability risk in the public sector, outlining how agencies can safely publish source code while reducing the risk of AI-accelerated vulnerability discovery. The guidance addresses growing concerns among technology leaders over whether advances in AI-assisted code analysis should force public sector organizations to move away from publishing source code openly by default.
The guidance noted that the primary driver of exploitation risk is not the publication of source code itself, but the existence of weaknesses in systems, including unpatched vulnerabilities, insecure implementation, and unsafe configuration or deployment practices. User research found that open publication may slightly reduce attacker uncertainty and accelerate analysis, particularly when combined with AI capabilities, but these risks become far more significant when organizations cannot rapidly remediate flaws or maintain systems effectively.
The document reinforces that securely operating publicly accessible services already requires a minimum level of operational maturity, including timely patching, secure deployment, continuous maintenance, and effective vulnerability management. While AI may increase the speed of vulnerability discovery, the guidance argues that stopping open publication of code by default would not address the root causes of exploitation risk.
The U.K. recommends setting a minimum standard for publicly accessible systems, requiring clear ownership, secure-by-design practices, automated hygiene, and credible remediation capability, while explicitly noting that keeping code private should not be treated as a substitute control. The guidance suggests keeping systems open by default, since making everything private introduces additional delivery and policy overhead and can reduce both reuse and scrutiny. Openness remains the baseline, with closure applied only sparingly and deliberately.
Where exceptions are necessary, they should be explicitly defined and reviewable, requiring a short threat model that identifies the attacker, explains what publication adds, and outlines a realistic path to harm, with all exceptions kept narrow, time-bound, and subject to periodic re-approval.
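One way to make such exceptions reviewable is to hold each one as a structured record that can be checked automatically. The sketch below is illustrative only: the `PublicationException` class, its field names, and the `problems` method are assumptions made for this example and are not taken from the guidance.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PublicationException:
    """Hypothetical record for one request to keep a repository closed."""
    repository: str
    attacker: str            # who the threat model expects to act
    publication_adds: str    # what publishing would give that attacker
    path_to_harm: str        # realistic route from published code to damage
    approved_on: date
    expires_on: date         # time-bound, so re-approval is forced

    def problems(self, today: date) -> list[str]:
        """Return reasons this exception should not currently stand."""
        issues = []
        # Each part of the short threat model must actually be filled in.
        for field_name in ("attacker", "publication_adds", "path_to_harm"):
            if not getattr(self, field_name).strip():
                issues.append(f"threat model is missing '{field_name}'")
        if self.expires_on <= self.approved_on:
            issues.append("expiry must fall after approval")
        elif today > self.expires_on:
            issues.append("exception has expired and needs re-approval")
        return issues
```

A periodic job could call `problems(date.today())` on every recorded exception and route any non-empty result back to the approving leader.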
The guidance stresses the need to strengthen remediation capability by assuming shorter discovery-to-exploit windows, enforcing patch SLAs, automating dependency and vulnerability management, and ensuring teams can respond quickly to external reports, a requirement that applies equally whether code is open or closed.
The U.K. government clarified that these requirements reflect the minimum operational capability already assumed under existing open-by-default and secure-by-design guidance, not new security expectations. At minimum, systems must have a named owner, a documented maintenance plan, and a clearly identified accountable team, visible through CODEOWNERS files or equivalent controls, covering the repository, dependencies, and support lifecycle. A defined security contact and intake process is required, with clear vulnerability reporting instructions via a SECURITY file or monitored mailbox, backed by a structured triage workflow.
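The ownership and reporting items can be checked mechanically. The following is a minimal sketch, assuming a repository checked out on disk and the file locations conventionally used on GitHub-style hosting (`CODEOWNERS` and `SECURITY.md` in the root, `.github/`, or `docs/`). The function names are hypothetical, and the check only confirms that the files exist and are non-empty, not that the named owners are real or that the mailbox is monitored.

```python
from pathlib import Path

# Conventional locations on GitHub-style hosting (an assumption of this sketch).
CODEOWNERS_PATHS = ("CODEOWNERS", ".github/CODEOWNERS", "docs/CODEOWNERS")
SECURITY_PATHS = ("SECURITY.md", ".github/SECURITY.md", "docs/SECURITY.md")


def has_nonempty_file(repo: Path, candidates: tuple[str, ...]) -> bool:
    """Return True if any candidate path exists in repo and contains text."""
    for rel in candidates:
        path = repo / rel
        if path.is_file() and path.read_text(encoding="utf-8").strip():
            return True
    return False


def ownership_gaps(repo: Path) -> list[str]:
    """List the baseline ownership and reporting items missing from repo."""
    gaps = []
    if not has_nonempty_file(repo, CODEOWNERS_PATHS):
        gaps.append("no CODEOWNERS file naming an accountable team")
    if not has_nonempty_file(repo, SECURITY_PATHS):
        gaps.append("no SECURITY file with reporting instructions")
    return gaps
```

Running `ownership_gaps(Path("."))` in a checkout returns an empty list when both files are present, which makes it easy to wire into a pre-publication check.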
Repositories must contain no secrets or sensitive operational data, with enforced controls preventing committed credentials and the removal of environment-specific information where exposure would materially increase exploitability. Systems must demonstrate a secure-by-design baseline, including threat modelling, secure defaults, least privilege, and hardened public interfaces, since operational hygiene cannot compensate for fundamentally unsafe architecture.
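To illustrate what a control against committed credentials looks for, here is a deliberately small pattern-based scanner. It is a sketch under stated assumptions: the three rules and the `find_secrets` name are chosen for this example, and production secret scanners use far larger rule sets plus entropy checks and history scanning.

```python
import re

# Illustrative rules only; real scanners ship many more patterns.
SECRET_PATTERNS = {
    "AWS access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private key block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "hard-coded password": re.compile(
        r"(?i)\bpassword\s*[:=]\s*['\"][^'\"]{8,}['\"]"
    ),
}


def find_secrets(text: str) -> list[tuple[int, str]]:
    """Return (line_number, rule_name) for each line matching a rule."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings
```

A pre-commit hook or CI step could run `find_secrets` over each staged file and block the change when the result is non-empty, which is the "enforced control" half of the requirement.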
Automated hygiene must be in place, with dependency update tooling, vulnerability and secret scanning, and protected branch controls that prevent force pushes and require critical infrastructure checks. Clear patching timelines must be defined and evidenced for critical and high-severity vulnerabilities. Unmaintained repositories must be clearly marked and archived, with associated services either decommissioned or assigned an explicit owner and patching route.
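Defined and evidenced patching timelines can be expressed as a simple deadline check. The sketch below assumes hypothetical SLA windows of 7 days for critical and 30 days for high-severity findings; the guidance as reported here does not give figures, so those numbers, the `Finding` class, and the `overdue` function are placeholders.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Hypothetical SLA windows in days; substitute the organization's own policy.
PATCH_SLA_DAYS = {"critical": 7, "high": 30}


@dataclass(frozen=True)
class Finding:
    identifier: str                 # internal ticket or advisory ID
    severity: str                   # "critical", "high", ...
    reported: date                  # date the team learned of the issue
    patched: Optional[date] = None  # None while the fix is outstanding


def overdue(findings: list[Finding], today: date) -> list[Finding]:
    """Return unpatched findings whose SLA deadline has passed."""
    late = []
    for f in findings:
        limit = PATCH_SLA_DAYS.get(f.severity)
        if limit is None or f.patched is not None:
            continue  # no SLA defined for this severity, or already fixed
        if today > f.reported + timedelta(days=limit):
            late.append(f)
    return late
```

The list returned by `overdue` doubles as evidence: an empty result on a given date shows the timelines were met, and a non-empty one identifies exactly which findings breached them.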
Making code private is not an acceptable mitigation for gaps in ownership, patching capability, or operational assurance. Systems that cannot be safely maintained must be remediated or retired. Where teams cannot meet this minimum standard, leadership must address the underlying gap by resourcing the capability, typically through shared services, or by retiring systems no longer required; no live service may exist without an explicit owner and patching pathway. Only once these requirements are met may leaders apply the existing exception rule, and only where publishing otherwise well-maintained code would create a specific and credible route to harm.
The recommendations set the default posture for organizations. Providing additional context for leadership decisions, the government listed a couple of considerations that highlight common pitfalls of closing by default and outline practical factors that influence real-world risk, whether code is public or private. Private repositories can create a false sense of security by encouraging security-by-obscurity thinking and reducing the urgency to address underlying weaknesses.
The government cautioned that closing code after publication may not remove exposure, since code developed in the open can remain accessible to capable adversaries through mirrors, forks, prior indexing, or earlier cloning by researchers or attackers, even after it is made private. It observed that closure can become a one-way decision, as private repositories reduce reuse and external scrutiny and, over time, lead to divergence between internal and external versions of the codebase, making it significantly harder to re-publish safely and confidently later.
The same tools and practices used for offensive discovery can be applied defensively. As discovery accelerates, security must rely on continuous review, testing, and remediation; openness reinforces that discipline, while a lack of scrutiny does not eliminate defects and may allow vulnerabilities to persist.
Openness can surface issues earlier by enabling a broader set of reviewers across government and the supplier ecosystem, whereas closing code concentrates discovery within delivery teams and internal monitoring processes. Precedent also matters, as broad AI-related justifications for closure can be easily replicated across organizations and, once normalized, can undermine cross-government coherence on reuse, transparency, and consistent engineering standards.
Commenting on the NCSC guidance, Rajeev Raghunarayan, head of GTM at Averlon, wrote in an emailed statement that it “gets the core issue right: agentic AI changes the risk model because these systems don’t just generate answers, they take actions. That means organizations need to think carefully about what an agent can access, what actions it can perform, what inputs can influence it, and who is accountable when something goes wrong.”
“Identity and permissions are the obvious starting point, but network access deserves equal attention. An agent’s ability to reach the internet, download tools, connect to APIs, or execute code can be just as consequential,” according to Raghunarayan. “Agents designed to solve problems dynamically may pull down packages, scripts, or tools when they don’t know how to complete a task natively. That creates a much more complex attack surface than static permissions alone.”
Recognizing that one of the big challenges in agentic systems is accountability, Steven Swift, managing director at Suzu Labs, wrote that even if a human is supposed to be accountable when an agentic system fails, it’s extremely easy for them to deflect blame to the AI. “In practice, this means that the accountable human needs to present the appearance of responsible guardrails in place for the agentic system. This mostly works as a deflection strategy, though if failures are frequent and significant enough to affect change, it is still helpful to have an accountable party put under pressure to improve incident posture.”
He also pointed out that agentic systems are not secure by design. “LLMs will have tacked on some safety training, though this varies by model, and is primarily focused on general safety issues. Meaning it will never be app-specific for any of your agentic systems.”
“One of the most common failures in agentic systems is by not treating LLM output as equivalent to untrusted user input,” Swift warned. “There are a lot of parallels between legacy security issues that arise from untrusted user input and new security issues from trusting LLM output. The primary reason here is that agentic systems by design, take LLM output and use it as input to the next step of the system. Thus, LLM output IS ACTUALLY input for further processing. That means it’s vulnerable to intentional and unintentional variations on intended behavior.”
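Swift's point can be made concrete with a validation step placed between the model and whatever consumes its output. The sketch below assumes a hypothetical agent that emits each step as a JSON object with `action` and `args` keys; the `ALLOWED_ACTIONS` table and the `parse_agent_step` function are inventions for this example, not part of any particular agent framework.

```python
import json

# Hypothetical allowlist: action name -> required argument names and types.
ALLOWED_ACTIONS = {
    "lookup_ticket": {"ticket_id": int},
    "send_summary": {"recipient": str, "body": str},
}


def parse_agent_step(raw_output: str) -> tuple[str, dict]:
    """Validate raw model output before any downstream step consumes it.

    Raises ValueError for anything outside the allowlist, exactly as one
    would for untrusted user input.
    """
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        raise ValueError("output is not valid JSON") from exc
    if not isinstance(data, dict):
        raise ValueError("output must be a JSON object")

    action = data.get("action")
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        raise ValueError(f"action not permitted: {action!r}")

    schema = ALLOWED_ACTIONS[action]
    args = data.get("args")
    if not isinstance(args, dict) or set(args) != set(schema):
        raise ValueError("unexpected or missing arguments")
    for name, expected in schema.items():
        value = args[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise ValueError(f"argument {name!r} has the wrong type")
    return action, args
```

The design choice is to fail closed: the model can only select from actions the application already defines, and a malformed or unexpected response stops the chain instead of flowing into the next step.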


