Voice deepfakes: how they are generated, used, misused and differentiated

Source: The post is based on the article “Voice deepfakes: how they are generated, used, misused and differentiated” published in The Hindu on 7th February 2023.

What is the News?

Several users of the social media platform 4chan used ElevenLabs, a “speech synthesis” and “voice cloning” service provider, to make voice deepfakes of celebrities like Emma Watson and Joe Rogan. These deepfake audio clips contained racist, abusive and violent comments.

What are Voice Deepfakes?

A voice deepfake is a synthetic voice that closely mimics a real person’s voice. It can accurately replicate the tonality, accent, cadence and other unique characteristics of the target person.

People use AI and robust computing power to generate such voice clones or synthetic voices. Sometimes it can take weeks to produce such voices.

How are voice deepfakes created?

Figure: Voice deepfakes (Source: SEON)

Creating deepfakes requires high-end computers with powerful graphics cards, along with cloud computing power.

Besides specialized tools and software, generating deepfakes requires training data to be fed to AI models. This data often consists of original recordings of the target person’s voice. AI can use it to render an authentic-sounding voice, which can then be made to say anything.

What are the threats arising from the use of voice deepfakes?

Firstly, attackers are using this technology to defraud users, steal their identities and engage in other illegal activities such as phone scams and posting fake videos on social media platforms.

– For instance, in 2020, a bank manager in the UAE received a phone call from someone he believed was a company director. The manager recognised the voice and authorised a transfer of $35 million, unaware that the director’s voice had been cloned.

Secondly, the use of voice deepfakes in filmmaking has also raised ethical concerns about the technology.
