AI can write your code, but nearly half of it may be insecure
While GenAI excels at producing functional code, it introduces security vulnerabilities in 45 percent of cases, according to Veracode’s 2025 GenAI Code Security Report, which analyzed code produced by over 100 LLMs across 80 real-world coding tasks.
Vibe coding
“The rise of vibe coding, where developers rely on AI to generate code, typically without explicitly defining security requirements, represents a fundamental shift in how software is built,” said Jens Wessling, CTO at Veracode. “The main concern with this trend is that [developers] do not need to specify security constraints to get the code they want, effectively leaving secure coding decisions to LLMs. Our research reveals GenAI models make the wrong choices nearly half the time, and it’s not improving.”
AI is also enabling attackers to identify and exploit security vulnerabilities more quickly. AI-powered tools can scan systems at scale, identify weaknesses, and even generate exploit code with minimal human input. This lowers the barrier to entry for less-skilled attackers and increases the speed and sophistication of attacks, putting traditional security defenses under pressure. Not only are vulnerabilities increasing, but exploiting them is also becoming easier.
LLMs introduce widespread security vulnerabilities in code
To assess the security of AI-generated code, Veracode created 80 coding tasks designed to expose common software weaknesses as defined by the MITRE Common Weakness Enumeration (CWE) system. Each task prompted more than 100 LLMs to complete code snippets, offering the opportunity to choose a secure or insecure implementation.
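To make the secure-versus-insecure choice concrete, here is a hypothetical Java pairing in the spirit of those tasks (it is not taken from Veracode’s task set): both completions satisfy the same functional prompt, but one concatenates untrusted input into a SQL statement (SQL injection, CWE-89) while the other uses a parameterized query.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class UserLookup {

    // Insecure completion: concatenating untrusted input into the query
    // string leaves the door open to SQL injection (CWE-89).
    static ResultSet findUserInsecure(Connection conn, String username) throws SQLException {
        String sql = "SELECT * FROM users WHERE name = '" + username + "'";
        return conn.createStatement().executeQuery(sql);
    }

    // Secure completion: a parameterized query keeps the input out of the
    // SQL grammar entirely.
    static ResultSet findUserSecure(Connection conn, String username) throws SQLException {
        PreparedStatement stmt =
                conn.prepareStatement("SELECT * FROM users WHERE name = ?");
        stmt.setString(1, username);
        return stmt.executeQuery();
    }
}
```

Nothing in a purely functional prompt such as “look up a user by name” forces the second form, and that default choice is what the tasks measure.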
The findings were stark: in 45 percent of all test cases, LLMs produced code containing vulnerabilities aligned with the OWASP Top 10, a list of the most serious web application security risks.
Some languages proved especially problematic. Java had the highest failure rate, with LLM-generated code introducing security flaws more than 70 percent of the time. Python, C# and JavaScript were not far behind, with failure rates between 38 and 45 percent.
LLMs also struggled with specific vulnerability types. Eighty-six percent of code samples failed to defend against cross-site scripting (CWE-80), and 88 percent were vulnerable to log injection attacks (CWE-117).
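As a minimal sketch of what the log injection case (CWE-117) looks like in practice, not an example drawn from the report, consider logging a failed login in Java: if the raw username reaches the log, an attacker can embed newline characters to forge extra entries, and the fix is to neutralize those characters before writing. The cross-site scripting case (CWE-80) is analogous, with HTML output encoding playing the role of the sanitizer.

```java
import java.util.logging.Logger;

public class LoginAudit {

    private static final Logger LOG = Logger.getLogger(LoginAudit.class.getName());

    // Vulnerable pattern (CWE-117): raw user input is written to the log,
    // so a username like "alice\nINFO: login succeeded for admin" forges
    // an additional, legitimate-looking log line.
    static void recordFailureInsecure(String username) {
        LOG.info("Login failed for user: " + username);
    }

    // Mitigated pattern: strip carriage returns and line feeds before the
    // value reaches the log.
    static void recordFailureSecure(String username) {
        String sanitized = username.replaceAll("[\\r\\n]", "_");
        LOG.info("Login failed for user: " + sanitized);
    }
}
```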
“Our research shows models are getting better at coding accurately but are not improving at security. We also found larger models do not perform significantly better than smaller models, suggesting this is a systemic issue rather than an LLM scaling problem,” Wessling said.
Securing the AI-driven software pipeline
While GenAI development practices like vibe coding accelerate productivity, they also amplify risks. Researchers emphasize that organizations need a risk management program that prevents vulnerabilities from reaching production by integrating code quality checks and automated fixes directly into the development workflow.
As organizations increasingly leverage AI-powered development, Veracode recommends taking the following proactive measures to ensure security:
- Integrate AI-powered tools into developer workflows to remediate security risks in real time.
- Leverage static analysis to detect flaws early and automatically, preventing vulnerable code from advancing through development pipelines (a toy version of such a gate is sketched after this list).
- Embed security in agentic workflows to automate policy compliance and ensure AI agents enforce secure coding standards.
- Use Software Composition Analysis (SCA) to ensure AI-generated code does not introduce vulnerabilities from third-party dependencies and open-source components.
- Adopt AI-driven remediation guidance to empower developers with fix instructions and train them to use the recommendations.
- Deploy a package firewall to automatically detect and block malicious packages, vulnerabilities, and policy violations.
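As a toy illustration of the static analysis gate mentioned above (assumptions: Java 16+, sources under src/, and a deliberately naive pattern match rather than a real analyzer such as Veracode’s), the sketch below scans for SQL built by string concatenation and exits non-zero so a CI stage can block the merge:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Stream;

// Toy pipeline gate: flag Java files that appear to build SQL by string
// concatenation and exit non-zero so a CI job can fail the build.
// Real static analyzers track data flow and cover far more CWEs; this
// only illustrates the "catch it before it advances" idea.
public class NaiveSqlConcatCheck {

    private static final Pattern SQL_CONCAT = Pattern.compile(
            "\"\\s*(SELECT|INSERT|UPDATE|DELETE)\\b[^\"]*\"\\s*\\+",
            Pattern.CASE_INSENSITIVE);

    public static void main(String[] args) throws IOException {
        Path root = Path.of(args.length > 0 ? args[0] : "src");
        int findings = 0;
        List<Path> javaFiles;
        try (Stream<Path> walk = Files.walk(root)) {
            javaFiles = walk.filter(p -> p.toString().endsWith(".java")).toList();
        }
        for (Path file : javaFiles) {
            List<String> lines = Files.readAllLines(file);
            for (int i = 0; i < lines.size(); i++) {
                if (SQL_CONCAT.matcher(lines.get(i)).find()) {
                    System.out.printf("%s:%d: possible SQL built by concatenation%n", file, i + 1);
                    findings++;
                }
            }
        }
        if (findings > 0) {
            System.exit(1); // a non-zero exit makes the CI stage fail
        }
    }
}
```

Run against the user-lookup example earlier in this article, such a gate would flag the concatenated query and let the parameterized one through.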