There are dozens of tools designed to detect whether an image or video is real or an AI-generated deepfake – but none of them are actually up to the job, CSIRO researchers have found after extensive testing that exposed the tools’ dangerous shortcomings.

Working with researchers at South Korea’s Sungkyunkwan University (SKKU), the CSIRO team analysed 51 leading deepfake detectors and tested 16 against a variety of deepfakes to see how effectively they could distinguish real from AI-generated content.

They tested three types of content: synthesis, which combines multiple elements into one image; face swaps, which superimpose one person’s face onto another’s body; and reenactment, which co-opts a target’s face to create videos of things they never said.

The researchers tested the likes of DeepFaceLab, Dfaker, Faceswap, LightWeight, FOM-Animation, FOM-Faceswap, and FSGAN against third-party testing sets DFDC and Celeb-DF, as well as against 250 deepfakes they created themselves and 1,383 deepfake videos sourced online.

Over the course of the testing, the researchers identified 18 different factors that affect the efficacy of deepfake detectors; some detectors proved effective in carefully controlled testing that played to their strengths.

Yet the detectors all fell over when applied to ‘in-the-wild’ content, regularly turning in “non-competitive” detection rates ranging from 39 per cent to 69 per cent – with the average around 55 per cent, little better than flipping a coin.

A key difference is the data that the detectors are trained on: while some detectors could, for example, pick out deepfakes of celebrities they were trained to recognise, those same detectors proved all but useless in detecting deepfakes of other people.

Other detectors struggled to identify even common objects in dark or grainy videos – one of the many shortcomings of today’s deepfake detectors, which the researchers classified by the 13 different methodologies they use.

This informed the creation of a five-step framework built around 18 factors the team called “essential” for deepfake detection – spanning deepfake type, detection methodology, data and preprocessing techniques, model and training techniques, and validation.

“Deepfakes are increasingly deceptive and capable of spreading misinformation, so there is an urgent need for more adaptable and resilient solutions to detect them,” CSIRO cybersecurity expert Dr Sharif Abuadbba said.

Pinning down a growing problem

Given that generative AI (genAI) technology can now easily make photorealistic deepfake images, high-quality videos, copycat voices and talking heads – which have already facilitated multi-million-dollar scams – improving detection tools is crucial.

“Relying on human ability to detect synthetic media is rarely effective,” research firm Gartner has noted, warning businesses to bolster defences against deepfakes so good that they will render identity verification systems unreliable by next year.

Security firm Trend Micro, for one, believes deepfakes are so good that we’ll soon see malicious ‘digital twins’ of real people as scammers train AI systems using their knowledge, personality, and writing style – producing believable digital henchmen.

“This happens more frequently than a lot of people realise,” Rob Greig, chief information officer with British engineering firm Arup, noted after a deepfake tricked its executives into sending $40 million ($US25 million) to criminals.

Social media, videoconferencing, and livestreaming sessions “are lucrative targets” for cybercriminals, a recent Forrester report warns in advising businesses to urgently beef up their deepfake detection defences.

Businesses should, it advises, tap techniques like spectral artefact analysis – which identifies unnatural audio and video patterns – as well as generative adversarial networks and liveness detection to determine whether an onscreen image is of a real person.
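To make the spectral artefact idea concrete, the sketch below is a rough illustration only – the function names, frequency cutoff, and threshold are assumptions for demonstration, not part of the CSIRO study or Forrester’s advice. It measures how much of a frame’s energy sits in high spatial frequencies, where GAN-style upsampling often leaves tell-tale peaks.

```python
# A minimal sketch of spectral artefact analysis on a single video frame,
# assuming the frame is supplied as a 2D grayscale NumPy array.
# Thresholds here are illustrative; a real detector would learn them from data.
import numpy as np

def high_freq_energy_ratio(frame: np.ndarray, cutoff: float = 0.25) -> float:
    """Fraction of spectral energy lying outside a low-frequency disc."""
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(frame))) ** 2
    h, w = spectrum.shape
    yy, xx = np.ogrid[:h, :w]
    radius = np.hypot(yy - h / 2, xx - w / 2) / (min(h, w) / 2)
    return float(spectrum[radius > cutoff].sum() / spectrum.sum())

def looks_synthetic(frame: np.ndarray, threshold: float = 0.35) -> bool:
    # 'threshold' is a hypothetical cut-off; production systems would combine
    # many such features rather than rely on a single hand-tuned number.
    return high_freq_energy_ratio(frame) > threshold
```

In practice such a signal would be only one feature among many, fed into a trained classifier rather than compared against a fixed threshold.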

Better detection tools are on the way

As the new CSIRO-SKKU study has confirmed, understanding the shortcomings of existing deepfake detection tools is crucial if there’s any hope of catching up with AI tools that are getting better and more convincing by the week.

By “exposing major vulnerabilities” in current tools, SKKU professor Simon S Woo said, the study “has deepened our understanding of how deepfake detectors perform in real-world conditions… paving the way for more resilient solutions.”

To improve their effectiveness, CSIRO cybersecurity expert Dr Kristen Moore said, better deepfake detectors will need to incorporate a range of data sets including audio, text, images, and metadata – as well as use synthetic data and contextual analysis.
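A minimal sketch of that multi-modal idea – assuming separate, hypothetical per-modality detectors that each return a 0-to-1 “likely fake” score – might fuse the scores with a weighted average; the modality names and weights below are illustrative assumptions, not the study’s method.

```python
# A minimal sketch of multi-modal score fusion. The weights are illustrative;
# a production system would learn them and handle missing modalities robustly.
from typing import Mapping

DEFAULT_WEIGHTS = {"video": 0.4, "audio": 0.3, "text": 0.2, "metadata": 0.1}

def fuse_scores(scores: Mapping[str, float],
                weights: Mapping[str, float] = DEFAULT_WEIGHTS) -> float:
    """Weighted average of 'likely fake' scores over available modalities."""
    present = [m for m in weights if m in scores]
    if not present:
        raise ValueError("no modality scores supplied")
    total = sum(weights[m] for m in present)
    return sum(weights[m] * scores[m] for m in present) / total

# Example: strong video signal, weaker audio and metadata signals
print(fuse_scores({"video": 0.9, "audio": 0.2, "metadata": 0.5}))
```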

Content ‘fingerprinting’ techniques will also be important, with many legitimate genAI vendors adopting digital watermarks that identify content as AI-generated – while, conversely, the absence of expected watermarks or provenance data can itself become a tipoff that content may be fake.

“By breaking down detection methods into their fundamental components and subjecting them to rigorous testing with real-world deepfakes,” Abuadbba said, “we’re enabling the development of tools better equipped to counter a range of scenarios.”