AI Application Gives Users Real-Time Feedback on the Level of Peace in the Social Media Videos They Watch

P. Gilda, P. Dungarwal, A. Thongkham, E. T. Ajayi, S. Choudhary, T. M. Terol, C. Lam, J. P. Araujo, M. McFadyen-Mungalln, L. S. Liebovitch, P. T. Coleman, H. West, K. Sieck, S. Carter

from arxiv, 6 pages, 4 figures, corrected typos, minor edits; v3: 16 pages, improved title, abstract, introduction, discussion, conclusions, added more references

Most people now get their news from videos on social media, such as YouTube and Facebook, rather than through curated journalism. "We become what we behold." The content and tone of language play an essential role in starting or ending conflicts: "hate speech" can intensify conflict, and "peace speech" can strengthen peace. We developed an application that measures these aspects of speech in YouTube videos in real time, giving users feedback on their own media diet. We used two approaches. 1) Supervised machine learning. Text from online news media was tagged with survey-based measures of the level of peace in the countries of origin. One fully connected feedforward neural network and 2 convolutional neural networks trained on those data were $\sim 97\%$ accurate in predicting levels of peace on the test set and $\sim 70\%$ accurate on a separate news text data set, but did not generalize to YouTube videos, suggesting that written text differs from transcribed spoken language. 2) Social science dimensions. No comparable external data exist to tag the text of YouTube video transcripts. We therefore used 2 word-level sentiment analysis (SA) methods and 6 context-level large language models (LLMs) to measure 5 social dimensions of peace identified by 59 social science studies: compassion-contempt, news-opinion, promotion-prevention, creativity-order, and nuance-simplification. On 52 videos, LLM scores matched the values assigned by 3 human coders more closely ($r^2 \sim 0.60$) than SA scores did ($r^2 \sim 0.03$). Results: LLMs successfully measured social dimensions important to peace in YouTube videos, as judged against human coders. These results serve as the basis of an analysis engine that can give users and content creators feedback on their own media diet and creations.
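
The comparison between automated scores and human coders can be illustrated with a minimal sketch. It assumes that $r^2$ is the squared Pearson correlation between a method's per-video score on one dimension and the mean of the human coders' ratings; the abstract does not state how $r^2$ was computed, so this is only one plausible reading. The function names and the toy numbers are hypothetical and are not data from the study.

```python
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return (sxy * sxy) / (sxx * syy)


def mean_human_score(ratings_per_video):
    """Average the coders' ratings for each video (one inner list per video)."""
    return [mean(ratings) for ratings in ratings_per_video]


if __name__ == "__main__":
    # Hypothetical toy ratings: 4 videos, 3 coders each, on one dimension
    # (for example compassion-contempt). Illustrative only, not study data.
    human = [[1, 2, 2], [3, 3, 4], [4, 5, 5], [2, 2, 3]]
    # Hypothetical scores from one automated method for the same 4 videos.
    model = [1.5, 3.0, 5.0, 2.5]
    print(r_squared(model, mean_human_score(human)))
```

Under this reading, a value near 0.60 means the method's scores explain about 60% of the variance in the human ratings, while a value near 0.03 means almost none.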
