ComputerWeekly

Cyber experts take an optimistic view of AI-powered hacking


The annual showcase at the Centre for Emerging Technology and Security (CETaS) kicked off with a discussion on the implications of Claude Mythos. 

Opening the conference, Alexander (Sacha) Babuta, director of CETaS at the Alan Turing Institute, said that Anthropic’s latest frontier model, Claude Mythos Preview, demonstrates major improvements in mathematics, cyber security, software engineering and automated vulnerability detection.

While the model can identify and autonomously exploit previously undiscovered vulnerabilities in real-world systems, he offered an optimistic view of how Claude Mythos Preview could be used to secure enterprise IT. “Companies can use models like Anthropic Mythos to rapidly discover vulnerabilities in their own systems and patch them to strengthen digital security for everyone,” said Babuta. 

A study of the cyber crime community between the release of ChatGPT in 2022 and the end of 2025 revealed that cyber crime forums played host to a number of “dark AI” products.

These are claimed by their owners to be homegrown or extensively retrained and jailbroken large language models (LLMs) customised and tailored for cyber crime. But despite generating some early enthusiasm on the forums, these have made little impact to date, Ben Collier, senior lecturer at the University of Edinburgh, said in a presentation discussing the findings.

When the researchers looked at enterprise-grade, legitimate products designed explicitly to turn a novice developer into a competent coder, they found many aspiring cyber criminals experimenting with tools such as ChatGPT and Claude, who, the researchers said, “excitedly report back on their discoveries”. However, Collier noted that a deeper exploration of these discussions found that, in most cases, forum members lacked the basic technical skills needed to use AI tools effectively to commit cyber crime.

“They’re using vibe coding tools for hobby projects, but particularly for the basic logistics of cyber crime operations,” he said. “Most of the coding involved in cyber crime isn’t hacking. It’s the same administration and basic engineering works that you’d need for any small startup, which means a lot of them don’t actually need to jailbreak Claude to get real utility out of it.”

The pessimistic view is that, as these tools evolve, they could be used for sophisticated cyber attacks. Adam Beaumont, interim director at the AI Security Institute (ASI) and former chief AI officer at GCHQ, set out this view. He said the ASI recently demonstrated how a frontier AI model executed a 32-step cyber attack against a simulated corporate environment, from initial reconnaissance through to full network takeover.

“We estimate it would take a skilled human professional 20 hours of work. This was the first time any model had done it, and weeks later, we tested a second model,” he said.

Beaumont pointed out that the attack he described was not a model answering a question about hacking. “It was a system that hacked,” he said. “We still don’t fully know how to ensure these systems act as we intend, or how to guarantee they remain under meaningful human control as they grow more capable.”

Beaumont called the ASI demonstration an “honest starting point”. “The uncertainty is real and the discomfort is appropriate,” he said.

For Beaumont, the demonstration represents an evidence base that can be built on, enabling government, industry and the research community to make decisions grounded in what these systems can actually do.
