OpenAI has issued a cautionary statement that its forthcoming AI models could present “high” cybersecurity risks as their capabilities rapidly advance. The warning, published on Wednesday, noted the potential for these AI models to either develop zero-day exploits against well-defended systems or assist in enterprise or industrial intrusion operations with tangible real-world consequences.
The company, known for ChatGPT, explained that as AI capabilities grow, its models could reach levels at which misuse would have significant real-world consequences.
OpenAI highlighted the dual-use nature of these technologies, noting that techniques used to strengthen defenses can also be repurposed for malicious operations. “As AI capabilities advance, we are investing in strengthening models for defensive cybersecurity tasks and creating tools that enable defenders to more easily perform workflows such as auditing code and patching vulnerabilities,” the blog post stated.
To mitigate these risks, OpenAI is implementing a multi-layered strategy involving access controls, infrastructure hardening, egress controls, monitoring, and ongoing threat intelligence efforts. These protections are designed to evolve alongside the threat landscape, enabling a quick response to new risks while preserving the utility of AI models for defensive purposes.
Assessing Cybersecurity Risks in AI Models
OpenAI noted that the cybersecurity proficiency of its AI models has improved over recent months. Capabilities measured through capture-the-flag (CTF) challenges increased from 27% on GPT‑5 in August 2025 to 76% on GPT‑5.1-Codex-Max by November 2025. The company expects this trajectory to continue and is preparing scenarios in which future models could reach “High” cybersecurity levels, as defined by its internal Preparedness Framework.
These high-level models could, for instance, autonomously develop working zero-day exploits or assist in stealthy cyber intrusions. OpenAI emphasized that its approach to safeguards combines technical measures with careful governance of model access and application. The company aims to ensure that these AI capabilities strengthen security rather than lower barriers to misuse.
Frontier Risk Council and Advisory Initiatives
In addition to technical measures, OpenAI is establishing the Frontier Risk Council, an advisory group that will bring experienced cyber defenders and security practitioners into direct collaboration with its teams. Initially focusing on cybersecurity, the council will eventually expand to other frontier AI capability domains.
Members will advise on balancing useful, responsible capabilities against the potential for misuse, and their input will inform model evaluations. OpenAI is also exploring a trusted access program for qualifying users and customers working in cyber defense. This initiative aims to provide tiered access to enhanced AI capabilities while maintaining control over potential misuse.
Beyond these initiatives, OpenAI collaborates with global experts, red-teaming organizations, and the broader cybersecurity community to evaluate potential risks and improve safety measures. This includes end-to-end red teaming to simulate adversary attacks and detection systems designed to intercept unsafe activity, with escalation protocols combining automated and human review.
Dual-Use Risks and Mitigation
OpenAI stressed that cybersecurity capabilities in AI models are inherently dual-use, with offensive and defensive knowledge often overlapping. To manage this, the company employs a defense-in-depth strategy, layering protection methods such as access controls, monitoring, detection, and enforcement programs. Models are trained to refuse harmful requests while remaining effective for legitimate educational and defensive applications.
OpenAI also works through the Frontier Model Forum, a nonprofit initiative involving leading AI labs, to develop shared threat models and ecosystem-wide best practices. This collaborative approach aims to create a consistent understanding of potential attack vectors and mitigation strategies across the AI industry.
Historical Context and Risk Management
This recent warning aligns with OpenAI’s prior alerts regarding frontier risks. In April 2025, the company issued a similar caution concerning bioweapons risks, followed by the release of ChatGPT Agent in July 2025, which was classified as “high” risk under the same framework. These measures reflect OpenAI’s ongoing commitment to evaluating and publicly disclosing potential hazards from advanced AI capabilities.
The company’s updated Preparedness Framework categorizes AI capabilities according to risk and guides operational safeguards. It distinguishes between “High” capabilities, which could amplify existing pathways to severe harm, and “Critical” capabilities, which could create unprecedented risks. Each new AI model undergoes rigorous evaluation to ensure that it sufficiently minimizes risks before deployment.
