The emergence of artificial emotional intelligence technology is revolutionizing the fields of computing and robotics, enabling a level of communication with humans, and of understanding of human behavior, that was once thought impossible. While recent advances in deep learning have transformed computer vision, the automated understanding of evoked or expressed emotions in visual media remains in its infancy. This lag stems from the absence of a universally accepted definition of "emotion", coupled with the inherently subjective and nuanced nature of emotions. In this article, we provide a comprehensive, multidisciplinary overview of emotion analysis in visual media, drawing on insights from psychology, engineering, and the arts. We begin by exploring the psychological foundations of emotion and the computational principles that underpin the understanding of emotions from images and videos. We then review the latest research and systems in the field, highlighting the most promising approaches. We also discuss the current technological challenges and limitations of emotion analysis, underscoring the need for continued investigation and innovation. We contend that emotion understanding in visual media is a "Holy Grail" research problem in computing, and we outline pivotal directions for future inquiry. Finally, we examine the ethical ramifications of emotion-understanding technologies and consider their potential societal impacts. Overall, this article aims to give readers a deeper understanding of emotion analysis in visual media and to inspire further research and development in this rapidly evolving field.