Video emotion recognition is an important branch of affective computing, and its solutions can be applied in fields such as human-computer interaction (HCI) and intelligent medical treatment. Although the number of papers published on emotion recognition is increasing, few comprehensive literature reviews cover research on video emotion recognition. This paper therefore surveys articles published from 2015 to 2023 to systematize existing trends in video emotion recognition. We first discuss two typical emotion models and then the databases frequently used for video emotion recognition, including unimodal and multimodal databases. Next, we examine and classify the structure and performance of modern unimodal and multimodal video emotion recognition methods, discuss the benefits and drawbacks of each, and compare them in detail in tables. We then summarize the primary challenges currently faced by video emotion recognition tasks and point out some of the most promising future directions, such as establishing an open benchmark database and developing better multimodal fusion strategies. The main objective of this paper is to help academic and industrial researchers keep up to date with the latest advances and new developments in this fast-moving, high-impact field of video emotion recognition.