Interpretable facial dynamics as behavioral and perceptual traces of deepfakes

Deepfake detection research has largely converged on deep learning approaches that, despite strong benchmark performance, offer limited insight into what distinguishes real from manipulated facial behavior. This study presents an interpretable alternative grounded in bio-behavioral features of facial dynamics and evaluates how computational detection strategies relate to human perceptual judgments. We identified core low-dimensional patterns of facial movement and derived from them temporal features that characterize spatiotemporal structure. Traditional machine learning classifiers trained on these features achieved modest but significantly above-chance deepfake classification, driven by higher-order temporal irregularities that were more pronounced in manipulated than in real facial dynamics. Notably, detection was substantially more accurate for videos containing emotive expressions than for those without. An emotional valence classification analysis further indicated that emotive signals are systematically degraded in deepfakes, which explains the differential impact of emotive dynamics on detection. We also address an additional and often overlooked dimension of explainability by assessing the relationship between model decisions and human perceptual detection. Model and human judgments converged for emotive videos but diverged for non-emotive videos, and even where outputs aligned, the underlying detection strategies differed. These findings demonstrate that face-swapped deepfakes carry a measurable behavioral fingerprint that is most salient during emotional expression. The model-human comparisons further suggest that interpretable computational features and human perception may offer complementary rather than redundant routes to detection.

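
To make the notion of "higher-order temporal irregularities" concrete, the following is a minimal sketch of how such features might be computed from a single facial-movement trajectory. The abstract does not specify which features were used, so everything here is an assumption for illustration: the function names, the choice of velocity, acceleration, and jerk statistics, and the 30 fps frame rate are all hypothetical and are not the authors' method.

```python
from statistics import fmean, pstdev


def finite_difference(series, dt):
    """First-order finite difference of a uniformly sampled series."""
    return [(b - a) / dt for a, b in zip(series, series[1:])]


def temporal_irregularity_features(trajectory, fps=30.0):
    """Summarize higher-order temporal derivatives of one movement trajectory.

    `trajectory` is a list of floats, e.g. the per-frame score of one
    low-dimensional facial movement component (hypothetical input).
    The feature set is an illustrative assumption, not the paper's.
    """
    if len(trajectory) < 4:
        raise ValueError("need at least 4 samples to estimate jerk")
    dt = 1.0 / fps
    velocity = finite_difference(trajectory, dt)        # 1st derivative
    acceleration = finite_difference(velocity, dt)      # 2nd derivative
    jerk = finite_difference(acceleration, dt)          # 3rd derivative
    return {
        "velocity_std": pstdev(velocity),
        "acceleration_std": pstdev(acceleration),
        "mean_abs_jerk": fmean([abs(j) for j in jerk]),
    }


# Toy comparison: a smooth movement versus the same movement with
# frame-to-frame jitter added (synthetic data, for illustration only).
smooth = [(i / 30.0) ** 2 for i in range(30)]
jittery = [x + 0.01 * (-1) ** i for i, x in enumerate(smooth)]

smooth_features = temporal_irregularity_features(smooth)
jittery_features = temporal_irregularity_features(jittery)
```

In this toy case the jittery trajectory yields a larger mean absolute jerk than the smooth one. Feature vectors of this kind, computed per video, could then be passed to any conventional classifier. Whether the study used these particular statistics is not stated in the abstract.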