As the volume of video content grows exponentially, accurate deception detection has become an important problem in human-centric video analysis. This research studies how extracting and combining features can improve the accuracy of deception detection models. We systematically extract features from visual, audio, and text data and experiment with different combinations of them, arriving at a model that achieves 99% accuracy. The methodology emphasizes the role of feature engineering in deception detection and provides a clear, interpretable framework. We train several machine learning models, including LSTM, BiLSTM, and pre-trained CNNs, in both single-modal and multi-modal settings. The results show that combining multiple modalities significantly improves detection performance over training on a single modality. The study highlights the potential of strategic feature extraction and combination for building reliable and transparent automated deception detection systems for video analysis, and points toward more accurate detection methods in future research.
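The idea of experimenting with different feature combinations can be illustrated with a minimal sketch. The abstract does not say how the modalities are fused, so this example assumes simple feature-level concatenation; the feature values and the helper names `modality_subsets` and `fuse` are hypothetical and are not taken from the study.

```python
from itertools import combinations

# Hypothetical per-modality feature vectors for one video clip (made-up values).
features = {
    "visual": [0.12, 0.80, 0.33],
    "audio": [0.45, 0.10],
    "text": [0.70, 0.05, 0.91, 0.22],
}


def modality_subsets(names):
    """Yield every non-empty combination of modality names, smallest first."""
    for size in range(1, len(names) + 1):
        yield from combinations(names, size)


def fuse(features, subset):
    """Concatenate the feature vectors of the chosen modalities.

    Assumption: plain concatenation; the study may fuse features differently.
    """
    fused = []
    for name in subset:
        fused.extend(features[name])
    return fused


# Build one fused input per modality combination (single- and multi-modal).
subsets = list(modality_subsets(sorted(features)))
fused_inputs = {subset: fuse(features, subset) for subset in subsets}

for subset, vector in fused_inputs.items():
    print("+".join(subset), "->", len(vector), "features")
```

Each fused vector could then be passed to a separate classifier, so that single-modality and multi-modality inputs are compared under the same training setup.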