Generative Artificial Intelligence (AI) has limitless potential but is equally exploitable. This has become evident with the development of WormGPT, followed by DarkBART, which is being termed the “dark version of Google Bard”.
Recent advances in AI can be applied across a wide range of fields and industries, and its possibilities are continuously expanding.
However, an alarming trend has emerged with the rise of video and voice deepfakes, where hyper-realistic videos and voice files are created, making it challenging to distinguish genuine content from fabricated content.
The use of voice deepfakes poses a particularly significant threat, as they can convincingly mimic the speech patterns and tone of the person being portrayed.
This technology has the potential to cause substantial financial losses, tarnish reputations, and even propagate misinformation on a large scale.
Although the use of deepfakes is not a novel concept, advancements in generative AI have made detecting their misuse exceptionally challenging.
This creates an urgent need for voice deepfake detection solutions to protect individuals, organizations, and society at large from these emerging threats.
Do you think you can differentiate between fake and genuine content?
Whether or not you already have a basic sense of how to spot fakes, this article aims to provide you with valuable clues to help identify the difference.
Deepfake, a portmanteau of “deep learning” and “fake”, refers to fabricated content manipulated to closely resemble the genuine samples fed to the tool. In the case of deepfake voices, the content is generated from voice samples to imitate and sound like someone else.
It can be exploited in multiple ways, including making voice calls to individuals, colleagues, or other cybercrime targets to ask for sensitive details.
Voice Deepfake Detection: Need of the Hour
The accessibility of user-friendly tools for creating deepfake audio and video raises an important question: “How can we detect voice deepfakes?”
Detecting voice deepfakes requires paying close attention to details and staying informed about the latest developments in the cybersecurity domain.
In a report, cybersecurity company Kaspersky highlighted ways to detect voice deepfakes:
- Fraudulent calls with threats, requests, or questions are often of poor quality, with distracting background noise.
- Monotony in speech is another giveaway, as synthetic voices often lack natural variation in pitch and intonation (a simple check for this is sketched after this list).
- Pronunciation of certain words may sound off. The bot may not recognize abbreviations, reading out “TA” as a word instead of expanding it to “Threat Actor”.
- Voice deepfake messages sent via email, calls, or WhatsApp, among other channels, may sound urgent, demanding immediate action such as sharing passwords or account details.
It is always advisable to call or contact the person concerned to verify the authenticity of the message and its demand or request.
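To make the monotony heuristic above concrete, here is a minimal Python sketch that measures pitch variation in a recording. It assumes the librosa and numpy libraries are installed; the file name sample.wav and the 20 Hz cutoff are hypothetical placeholders rather than validated thresholds, and a low score is only a prompt for closer scrutiny, not proof of a fake.

```python
# Minimal sketch of the "monotony" heuristic: natural speech tends to show
# wide pitch (f0) variation, while some synthetic voices sound flat.
# Assumptions: librosa and numpy installed; "sample.wav" is a hypothetical
# input file; the 20 Hz threshold is illustrative, not a tested value.
import librosa
import numpy as np

def pitch_variation(path: str) -> float:
    """Return the standard deviation of the estimated pitch (Hz) in a recording."""
    y, sr = librosa.load(path, sr=16000)
    # pyin estimates the fundamental frequency frame by frame;
    # unvoiced frames come back as NaN and are ignored by nanstd below.
    f0, voiced_flag, voiced_prob = librosa.pyin(
        y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C7"), sr=sr
    )
    return float(np.nanstd(f0))

if __name__ == "__main__":
    std_hz = pitch_variation("sample.wav")
    print(f"Pitch standard deviation: {std_hz:.1f} Hz")
    if std_hz < 20.0:  # illustrative cutoff for a "flat" voice
        print("Low pitch variation: unusually monotonous, worth a closer look.")
```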
In a voice deepfake scam reported in 2019, a criminal called a UK-based energy firm posing as the Chief Executive Officer of its German parent company.
Speaking in the parent-company CEO’s voice, complete with a German accent, the caller requested a money transfer, and the UK CEO complied.
The transfer was made and the losses incurred. The UK CEO only sensed something amiss when a second call requesting another transfer came from an Austrian number.
Voice deepfake detection tools are being developed to better sense fraudulent audio, such as fabricated speeches by politicians circulated in the news, or statements attributed to celebrities accusing another celebrity of fraud.
Just like voice deepfake technology itself, detection technology is evolving to keep pace with newer tactics criminals use to sound near perfect. So far, however, perfection has not been attained on either side.
“To determine whether some audio piece is a fake or a speech of a real human, there are several characteristics to consider: the timbre, manner and intonation of speech,” Kaspersky noted.
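These characteristics can be approximated with standard acoustic features: mel-frequency cepstral coefficients (MFCCs) roughly capture timbre, while pitch-contour statistics capture intonation. Below is a hedged Python sketch of such a feature extractor; the RandomForest detector, its training data, and the particular feature choices are illustrative assumptions, not Kaspersky’s method.

```python
# Sketch of extracting the kinds of characteristics Kaspersky mentions:
# MFCCs as a rough proxy for timbre, f0 statistics as a proxy for intonation.
# Assumptions: librosa, numpy, and scikit-learn installed; the classifier,
# file lists, and feature set are illustrative, not a vendor's pipeline.
import numpy as np
import librosa
from sklearn.ensemble import RandomForestClassifier

def audio_features(path: str) -> np.ndarray:
    y, sr = librosa.load(path, sr=16000)
    # Timbre: mean and spread of mel-frequency cepstral coefficients.
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    # Intonation: pitch contour statistics from pyin (NaN for unvoiced frames).
    f0, _, _ = librosa.pyin(y, fmin=65.0, fmax=2093.0, sr=sr)
    return np.concatenate([
        mfcc.mean(axis=1), mfcc.std(axis=1),
        [np.nanmean(f0), np.nanstd(f0)],
    ])

# Training would require a labeled corpus of real and synthetic speech,
# e.g. hypothetical file lists `real_files` and `fake_files`:
# X = np.stack([audio_features(p) for p in real_files + fake_files])
# y = np.array([0] * len(real_files) + [1] * len(fake_files))
# clf = RandomForestClassifier(n_estimators=200).fit(X, y)
# print(clf.predict_proba(audio_features("suspect.wav")[None, :]))
```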
Addressing the technology behind voice deepfakes, Dmitry Anikin, Senior Data Scientist at Kaspersky, said, “Currently the technology for creating high-quality deepfakes is not available for widespread use.”
“Most likely, attackers will try to generate voices in real time – to impersonate someone’s relative and lure out money, for example,” Anikin added.
When it comes to voice deepfake detection, it is important not to make decisions based on emotion or the sense of urgency that criminals deliberately create.
Don’t share details or money with anyone; end the call and cross-check the caller’s identity to make sure it is the right person on the other end, the report concluded.
A voice deepfake message that is ignored may not directly impact security; responding with the requested data, however, could lead to real security issues. It is better to err on the side of caution and wait for verification before taking any action.