Introduction to Classic Deepfake Detection Models

Introduction

With the rapid development of Generative Adversarial Networks (GANs) and Diffusion Models, Deepfake content is becoming increasingly prevalent on the internet. Effectively identifying these forged images and videos has become a critical issue in multimedia forensics and information security.

Core Content

This article introduces several milestone works in Deepfake detection, grouped by the type of feature each one relies on:

1. Spatial Feature-based Models

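MesoNet: A compact CNN that focuses on mesoscopic artifacts in compressed facial images, an intermediate level of detail between pixel-level noise (which compression degrades) and high-level semantics.
Xception-based (FaceForensics++): A representative of transfer learning. An Xception network pre-trained on a large-scale dataset (ImageNet) is fine-tuned on face crops from FaceForensics++.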

2. Temporal Feature-based Models

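Deepfake Stacked RNN: Feeds per-frame features into stacked recurrent layers and uses the continuity between video frames to capture forgery traces, such as flicker or inconsistencies that frame-by-frame synthesis tends to leave behind.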
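
The data flow of a stacked recurrent detector can be sketched in plain Python. This is only an illustration under stated assumptions: the per-frame feature vectors are assumed to come from some upstream CNN, the cells are plain tanh recurrent units (the original work may use a different cell type), and the weights are random and untrained, so the score carries no meaning. All function names are hypothetical.

```python
import math
import random

def make_rnn_layer(input_size, hidden_size, rng):
    """Create one recurrent layer with random (untrained) weights."""
    def matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    return {
        "w_in": matrix(hidden_size, input_size),    # input -> hidden
        "w_rec": matrix(hidden_size, hidden_size),  # hidden -> hidden
        "bias": [0.0] * hidden_size,
    }

def run_rnn_layer(layer, sequence):
    """Run a tanh recurrent layer over a sequence; return one hidden state per step."""
    hidden = [0.0] * len(layer["bias"])
    outputs = []
    for x in sequence:
        new_hidden = []
        for j, bias in enumerate(layer["bias"]):
            total = bias
            total += sum(w * xi for w, xi in zip(layer["w_in"][j], x))
            total += sum(w * hi for w, hi in zip(layer["w_rec"][j], hidden))
            new_hidden.append(math.tanh(total))
        hidden = new_hidden
        outputs.append(hidden)
    return outputs

def stacked_rnn_score(frame_features, layers, readout):
    """Pass frame features through stacked layers and squash the last state to (0, 1)."""
    if not frame_features:
        raise ValueError("need at least one frame")
    sequence = frame_features
    for layer in layers:
        # Each layer consumes the hidden-state sequence of the layer below it.
        sequence = run_rnn_layer(layer, sequence)
    logit = sum(w * h for w, h in zip(readout, sequence[-1]))
    return 1.0 / (1.0 + math.exp(-logit))

# Usage with made-up 3-dimensional features for three frames.
rng = random.Random(0)
layers = [make_rnn_layer(3, 4, rng), make_rnn_layer(4, 4, rng)]
readout = [rng.uniform(-0.5, 0.5) for _ in range(4)]
frames = [[0.1, 0.2, 0.3], [0.0, -0.1, 0.4], [0.5, 0.5, -0.2]]
fake_probability = stacked_rnn_score(frames, layers, readout)
```

Because every hidden state depends on the previous one, the final score depends on the order of the frames, which is what lets a trained model of this shape react to temporal inconsistencies rather than to single-frame artifacts.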

3. Frequency Domain-based Models

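F3-Net: Identifies forgeries through frequency-aware decomposition of the image and local frequency statistics, exposing artifacts that are subtle in pixel space but visible in the frequency spectrum.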
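
To make the idea of frequency decomposition concrete, the sketch below computes a 2D DCT of a small grayscale patch and reports how its energy splits across low, middle, and high frequency bands. It is a simplified illustration, not F3-Net itself: the fixed band cutoffs and the function names are assumptions chosen for this example, whereas a trained detector would learn which bands matter.

```python
import math

def dct_2d(block):
    """Orthonormal 2D DCT-II of a square block given as a list of lists."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    # basis[k][i] = cos(pi * (2i + 1) * k / (2n))
    basis = [[math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n)]
             for k in range(n)]
    coeffs = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += block[x][y] * basis[u][x] * basis[v][y]
            coeffs[u][v] = alpha(u) * alpha(v) * total
    return coeffs

def band_energy_fractions(block):
    """Return (low, mid, high) fractions of DCT energy for a square block.

    Bands are split by u + v with illustrative cutoffs at n // 2 and n.
    """
    n = len(block)
    coeffs = dct_2d(block)
    energy = [0.0, 0.0, 0.0]
    for u in range(n):
        for v in range(n):
            if u + v < n // 2:
                band = 0  # low frequencies
            elif u + v < n:
                band = 1  # middle frequencies
            else:
                band = 2  # high frequencies
            energy[band] += coeffs[u][v] ** 2
    total = sum(energy)
    if total == 0.0:
        return (0.0, 0.0, 0.0)
    return tuple(e / total for e in energy)

# Usage: a flat patch is almost purely low-frequency,
# a checkerboard patch is dominated by high frequencies.
flat = [[5.0] * 8 for _ in range(8)]
checker = [[1.0 if (x + y) % 2 == 0 else -1.0 for y in range(8)] for x in range(8)]
flat_bands = band_energy_fractions(flat)
checker_bands = band_energy_fractions(checker)
```

Band energy fractions like these could serve as hand-crafted features for a simple classifier; the quadruple loop is written for clarity and is only practical for small patches.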

4. Biological Feature-based Models

Lip-sync Check: Tests whether lip movements are synchronized with the accompanying speech.
Blink Detection: Exploits the fact that early Deepfake models often struggled to generate natural blinking, so missing or irregular blinks can indicate a forgery.
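
A blink check can be sketched with the eye aspect ratio (EAR), a common landmark-based measure of eye openness. Its use here is an assumption for illustration, not a method this article attributes to any specific detector. The six eye landmarks are assumed to come from an external facial landmark detector in the usual order (p1 and p4 are the eye corners, p2 and p3 lie on the upper lid, p5 and p6 on the lower lid), and the threshold and minimum run length are illustrative values that would need tuning.

```python
import math

def eye_aspect_ratio(points):
    """EAR from six (x, y) eye landmarks ordered p1..p6.

    EAR = (|p2 - p6| + |p3 - p5|) / (2 * |p1 - p4|)
    """
    p1, p2, p3, p4, p5, p6 = points
    horizontal = math.dist(p1, p4)
    if horizontal == 0.0:
        return 0.0  # degenerate landmarks; treat the eye as closed
    vertical = math.dist(p2, p6) + math.dist(p3, p5)
    return vertical / (2.0 * horizontal)

def count_blinks(ear_series, threshold=0.2, min_frames=2):
    """Count runs of at least min_frames consecutive frames with EAR below threshold."""
    blinks = 0
    run = 0
    for ear in ear_series:
        if ear < threshold:
            run += 1
        else:
            if run >= min_frames:
                blinks += 1
            run = 0
    if run >= min_frames:
        blinks += 1  # the video ended while the eye was still closed
    return blinks

# Usage with made-up values: one EAR per video frame.
open_eye = [(0, 0), (1, 1), (2, 1), (3, 0), (2, -1), (1, -1)]
ear_open = eye_aspect_ratio(open_eye)
blinks = count_blinks([0.3, 0.3, 0.1, 0.1, 0.3, 0.1, 0.3, 0.1, 0.1, 0.1])
```

A clip whose blink count is far below what its duration would suggest for a real person could then be flagged for closer inspection, although this cue alone is weak because newer generators reproduce blinking more convincingly.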