AI-powered Vishing

First, there was phishing. The goal: to trick targets into revealing information or completing unauthorized actions. Around since the 1990s, this attack vector remains the top internet crime reported to the FBI, partly because of how effectively pretexting exploits human emotions.

After all, you can’t simply issue a patch or apply a firewall to human instinct. Consider how a new starter may feel pressured to skip the usual security procedures after being told, ‘You need to download this file for your onboarding.’ Or how a junior payroll administrator is unlikely to challenge someone who claims to be the regional finance director and needs an invoice paid immediately.

There’s huge power in manipulating people’s feelings and their responses to trust, fear, and urgency. These social engineering threats are going nowhere. In fact, they’re becoming more advanced, as shown by the rise in voice phishing, otherwise known as vishing.

In the past, vishing could often be identified by its automated, robotic voices, or because the voice on the phone didn’t sound like the person being impersonated. That’s all changing because of AI.

How AI is accelerating vishing use cases and capabilities

Vishing has traditionally been labor-intensive: first selecting the target, then applying various psychological techniques to verbally encourage them to fulfill the attack request. This demands a skilled manipulator of human emotions, someone capable of improvising during a live call and knowing when to threaten, coerce, or even impersonate a colleague or third-party contact.

You’d also need supporting tactics and technologies, such as caller ID spoofing to present a phone number the target would recognize, or crawling the target’s social media channels for clues that could help build rapport during a call, perhaps their favorite sports team or a recent holiday destination.

Of course, vishing has also relied on the target believing and trusting a voice they have probably never heard before. At least, that was true until AI technology started being used for vishing. Now attackers have a new threat vector, one that adds a new dimension to attacks and asks new questions of enterprise defenses.

Advances in deception methods with AI technology

It’s now possible to clone someone’s voice with as little as 15 seconds of audio, whether that’s taken from online videos or recorded on a live phone call. That’s enough for the AI to capture and learn their vocal nuances, inflections, and tones.

Cyber criminals can then use the cloned voice for text-to-speech vishing attacks during real-time, two-way conversations. Our voices are about as unique as fingerprints, and thousands of years of evolution have conditioned us to recognize familiar voices without thinking, ‘Is that really who I think it is?’ So it’s not practical to expect people to suddenly stop trusting voice as an authentication factor.

The deception doesn’t stop at the voice itself. AI can also be deployed to bring context to conversations. Attackers can scrape the web in real time, often using a mix of OSINT techniques and capabilities, and feed the results into an LLM. This allows the AI to communicate with relevance and recency, such as mentioning recent news to add an authentic-sounding layer to its requests.

You can see the impact of this in a study of an AI-automated vishing attack simulation, which extracted sensitive data from 77% of participants. This was in part due to the chosen LLM offering ‘advanced capabilities in context understanding, response generation speed, and fluency in conversation, which is crucial in giving the illusion of a real-time conversation in a phone call.’

Scaling attacks with AI-powered vishing

This AI-driven approach is a long way from traditional vishing call centers, with human agents making calls from physical workstations. Cyber criminals can now launch hyper-personalized 1:1 attacks at scale, tailoring messages to multiple targets at once if needed, and then simply update the language based on what the AI learns, rather than reprogramming the attack if it’s unsuccessful.

Many of these AI voice cloning tools are open source, opening up free or low-cost access to a wider range of malicious actors across the world. As development accelerates, the nature of vishing is evolving too, with AI allowing attackers to operate with what the FBI describes as ‘unprecedented realism’. In Q4 of 2023, vishing attacks rose by a reported 260% year-on-year. Meanwhile, in early 2024 a deepfake video call led to a finance worker paying out $25 million to someone they thought was their CFO.

Attackers may often gain a first-mover advantage with emerging technologies, leaving defense teams racing to catch up and build countermeasures. However, attackers still rely on human vulnerabilities to breach successfully. So that’s where businesses should start: by taking a people-centric approach to defending against AI-driven vishing.

How to tackle AI vishing’s first-mover advantage

It starts with training the workforce to be aware of vishing and AI developments, helping them recognize when their emotions are potentially being triggered, especially when a request involves sharing data, passwords, or other sensitive information.

Along with the theory, it also means giving them practice at being the subject of a vishing attack. After all, people may forget what a lecturer or online course says about AI vishing. They’re more likely to remember the experience of being vished and thinking, ‘What if that had been a real attack?’

This allows employees to reflect and learn new behaviors and thought processes. They don’t have to repress the emotions and thoughts that attackers want to exploit. They just learn when to take a step back during a call and ask: Am I being pressured to complete this action for this person? Am I being asked to believe something they say without any proof? Should I end the call and ring the person back on their main office number?

Simulation helps lighten the load on cybersecurity teams too. They can’t always stay ahead of the latest vishing strategies, and developing protection systems such as voice biometrics takes time. Educating employees through simulation and real-world training, by contrast, can be an effective and immediate line of defense.

About the Author

Thomas LE COZ is the CEO of Arsen. He develops solutions to reduce the impact of social engineering in cyberattacks. By simulating attacks ranging from regular phishing to voice-clone vishing, Arsen provides a complete platform to evaluate, train and automate behavior improvements of the workforce with a “learn-by-doing” approach. Thomas can be reached online on LinkedIn and at our company website https://arsen.co

