The rapid growth of live-streaming platforms such as Twitch has introduced complex challenges in moderating toxic behavior. Traditional moderation approaches, such as human annotation and keyword-based filtering, have demonstrated utility, but human moderators on Twitch struggle to scale in the platform's fast-paced, high-volume, and context-rich chat environment while also facing harassment themselves. Recent advances in large language models (LLMs), such as DeepSeek-R1-Distill and Llama-3-8B-Instruct, offer new opportunities for toxicity detection, especially for understanding nuanced, multimodal communication involving emotes. In this work, we present an exploratory comparison of toxicity detection approaches tailored to Twitch. Our analysis reveals that incorporating emotes improves the detection of toxic behavior. To this end, we introduce ToxiTwitch, a hybrid model that combines LLM-generated embeddings of text and emotes with traditional machine learning classifiers, including Random Forest and SVM. In our case study, the proposed hybrid approach reaches up to 80 percent accuracy under channel-specific training (a 13 percent improvement over BERT, with an F1-score of 76 percent). This exploratory study is intended to surface the challenges and limits of emote-aware toxicity detection on Twitch.
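The hybrid pipeline described above (LLM-generated embeddings fed into traditional classifiers) can be sketched minimally as follows. This is an illustrative sketch, not the paper's implementation: the embeddings here are simulated random vectors standing in for LLM outputs (with toxic messages shifted in embedding space so the toy classifiers have signal to learn), and the dataset, dimensions, and hyperparameters are all hypothetical. A real pipeline would replace the simulated matrix with per-message embeddings from a model such as Llama-3-8B-Instruct, computed over the chat text together with its emotes.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

# Simulated 64-d "LLM embeddings" of chat messages (stand-ins for real
# model outputs): toxic messages are mean-shifted so the classifiers
# have a learnable signal in this toy setup.
rng = np.random.default_rng(0)
dim, n_per_class = 64, 200
X_clean = rng.normal(0.0, 1.0, size=(n_per_class, dim))
X_toxic = rng.normal(0.8, 1.0, size=(n_per_class, dim))
X = np.vstack([X_clean, X_toxic])
y = np.array([0] * n_per_class + [1] * n_per_class)  # 0 = clean, 1 = toxic

# Shuffle and hold out 20% for evaluation.
idx = rng.permutation(len(y))
split = int(0.8 * len(y))
train_idx, test_idx = idx[:split], idx[split:]

# The two traditional classifiers named in the abstract.
classifiers = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "svm": SVC(kernel="rbf", random_state=0),
}

scores = {}
for name, clf in classifiers.items():
    clf.fit(X[train_idx], y[train_idx])
    scores[name] = clf.score(X[test_idx], y[test_idx])
```

Because the simulated classes are well separated, both classifiers score highly here; on real Twitch chat the separation would depend entirely on how well the LLM embeddings capture text-plus-emote semantics.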