Unveiling the Role of ChatGPT in Detecting Deepfakes

Is ChatGPT the key to stopping deepfakes? Study asks LLMs to spot...

When most people think of artificial intelligence, they probably think of—and worry about—ChatGPT and deepfakes. An example of ChatGPT’s analysis of deepfake images. The large language model was less accurate than state-of-the-art deepfake detectors, but impressed researchers with its ability to explain its analysis in plain language.

Spotting Deepfakes with LLMs

AI-generated text and images dominate our social media feeds and the other websites we visit, sometimes without our knowledge, and they are often used to spread unreliable and misleading information. But what if text-generating models like ChatGPT could spot deepfake images?

A University at Buffalo-led research team has applied large language models (LLMs), including OpenAI’s ChatGPT and Google’s Gemini, toward spotting deepfakes of human faces. Their study, presented last week at the IEEE/CVF Conference on Computer Vision & Pattern Recognition, found that LLMs’ performance lagged behind that of state-of-the-art deepfake detection algorithms, but their natural language processing may actually make them the more practical detection tool in the future.

Explaining Findings in Plain Language

“What sets LLMs apart from existing detection methods is the ability to explain their findings in a way that’s comprehensible to humans, like identifying an incorrect shadow or a mismatched pair of earrings,” says the study’s lead author, Siwei Lyu, PhD, SUNY Empire Innovation Professor in the Department of Computer Science and Engineering, within the UB School of Engineering and Applied Sciences.

The future landscape of large language models in medicine ...

Collaborators on the study include the University at Albany and the Chinese University of Hong Kong, Shenzhen. The work was supported by the National Science Foundation.

Connecting Words and Images

Trained on much of the available text on the internet—amounting to some 300 billion words—ChatGPT finds statistical patterns and relationships between words in order to generate responses. The latest versions of ChatGPT and other LLMs can also analyze images. These multimodal LLMs use large databases of captioned photos to find the relationships between words and images.

New AI and Large Language Model Tools for Journalists: What to ...

“Humans do this as well. Whether it be a stop sign or a viral meme, we constantly assign a semantic description to images,” says the study’s first author, Shan Jai, assistant lab director in the UB Media Forensic Lab. “In this way, images become their own language.”

Accuracy in Detecting AI-generated Images

The Media Forensics Lab team decided to test if GPT-4 with vision (GPT-4V) and Gemini 1.0 could tell the difference between real faces and faces generated by AI. They gave it thousands of images of both real and deepfake faces and asked it to identify any potential signs of manipulation or synthetic artifacts.

Fortifying Digital Customer Onboarding Against Deepfakes

ChatGPT was accurate 79.5% of the time on detecting synthetic artifacts in images generated by latent diffusion, and 77.2% of the time on StyleGAN-generated images. “This is comparable to earlier deepfake detection methods, so with proper prompt guidance, ChatGPT can do a fairly decent job at detecting AI-generated images,” says Lyu, who is also co-director of the UB Center for Information Integrity.

Explanation and User-friendliness

More crucially, ChatGPT could explain its decision making in plain language. When provided an AI-generated photo of a man with glasses, the model correctly pointed out certain features indicating manipulation.

Existing deepfake detection models will tell us the probability of an image being real or fake, but they will very rarely tell us why they came to this conclusion. And even if we look into the model’s underlying mechanisms, there will be features that we simply can’t understand. Meanwhile, everything ChatGPT outputs is understandable to humans.

Common Sense Understanding

That’s because ChatGPT bases its analysis on semantic knowledge alone. Whereas traditional deepfake detection algorithms distinguish real from fake by training on large datasets of images labeled real or fake, LLMs’ natural language abilities give them something of a common sense understanding of reality—including the typical symmetry of human faces and the look of real photographs.

“Once the vision component of ChatGPT understands an image as a human face, the language component can make the inference that a face will typically have two eyes, and so on,” Lyu says. “The language component provides a deeper connection between visual and verbal concepts.”

User-friendly Deepfake Tool

ChatGPT’s semantic knowledge and natural language processing make it a more user-friendly deepfake tool for both users and developers, the study concluded.

ChatGPT’s performance was well below the latest deepfake detection algorithms, which have accuracy rates in the mid- to high-90s. This was partly because LLMs can’t catch signal-level statistical differences that are invisible to the human eye but often used by detection algorithms to spot AI-generated images. And other LLMs may not be as effective at explaining their analysis.

Ultimately, the study highlights the potential of ChatGPT and similar models to aid in the detection of deepfake content, providing explanations in plain language for better understanding.