Deception detection is an interdisciplinary field that attracts researchers from psychology, criminology, computer science, and economics. We propose a multimodal approach that combines deep learning and discriminative models for automated deception detection from video. We apply convolutional end-to-end learning to gaze, head pose, and facial expressions, with promising results compared to state-of-the-art methods. Because training data is limited, we also use discriminative models; we explore sequence-to-class approaches as well, but the discriminative models outperform them given the scarce data. We evaluate our approach on five datasets, including a new Rolling-Dice Experiment motivated by economic factors. Facial expressions outperform gaze and head pose, and combining modalities with feature selection improves detection performance. The expressed features differ across datasets, which underlines the importance of scenario-specific training data and the influence of context on deceptive behavior; cross-dataset experiments reinforce these findings. Despite the challenges posed by low-stakes datasets, including the Rolling-Dice Experiment, detection performance exceeds chance level. Our multimodal approach and comprehensive evaluation indicate the potential of automating deception detection from video modalities and open avenues for future research.