A Comprehensive Review of Visual-Textual Sentiment Analysis from Social Media Networks

Social media networks have become a significant part of people's lives, serving as a platform for their ideas, opinions and emotions. Consequently, automated sentiment analysis (SA) is critical for recognising people's feelings in ways that other information sources cannot. Analysing these feelings supports a range of applications, including brand evaluation, YouTube film reviews and healthcare. As social media continues to develop, people post massive amounts of information in different forms, including text, photos, audio and video. Traditional text-based SA algorithms are therefore limited, because they do not consider the expressiveness of the other modalities. By combining features from these different sources, multimodal data streams offer new opportunities to improve results beyond text-based SA. Our study focuses on the forefront field of multimodal SA, which examines visual and textual data posted on social media networks. Many people tend to use these two forms of information to express themselves on these platforms. To serve as a resource for academics in this rapidly growing field, we present a comprehensive overview of textual and visual SA, covering data pre-processing, feature extraction techniques, sentiment benchmark datasets and the efficacy of the classification methodologies suited to each field. We also briefly introduce the most frequently used data fusion strategies and summarise existing research on visual-textual SA. Finally, we highlight the most significant challenges and examine several important sentiment applications.
