Mental health challenges and cyberbullying are increasingly prevalent in digital spaces, necessitating scalable and interpretable detection systems. This paper introduces a unified multiclass classification framework for detecting ten distinct mental health and cyberbullying categories from social media data. We curate datasets from Twitter and Reddit, implementing a rigorous "split-then-balance" pipeline to train on balanced data while evaluating on a realistic, held-out imbalanced test set. We conduct a comprehensive evaluation comparing traditional lexical models, hybrid approaches, and several end-to-end fine-tuned transformers. Our results demonstrate that end-to-end fine-tuning is critical for performance: the domain-adapted MentalBERT emerges as the top model, achieving an accuracy of 0.92 and a macro F1 score of 0.76, surpassing both its generic counterpart and a zero-shot LLM baseline. Grounded in a comprehensive ethical analysis, we frame the system as a human-in-the-loop screening aid, not a diagnostic tool. To support this, we introduce a hybrid SHAP-LLM explainability framework and present a prototype dashboard ("Social Media Screener") designed to integrate model predictions and their explanations into a practical workflow for moderators. Our work provides a robust baseline and highlights the need for multi-label, clinically validated datasets at the critical intersection of online safety and computational mental health.
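The "split-then-balance" evaluation pipeline described above can be sketched as follows. This is a minimal illustration, not the paper's exact implementation: the function name, the use of simple random oversampling, and the split fraction are all assumptions. The essential property it demonstrates is that balancing is applied only after the train/test split, so the held-out test set retains the original imbalanced class distribution.

```python
import random
from collections import defaultdict

def split_then_balance(samples, labels, test_frac=0.2, seed=0):
    """Split first, then oversample minority classes in the TRAIN split only.

    Returns (train, test) as lists of (sample, label) pairs. The test set
    keeps its original, imbalanced label distribution.
    """
    rng = random.Random(seed)
    idx = list(range(len(samples)))
    rng.shuffle(idx)
    n_test = int(len(idx) * test_frac)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    train = [(samples[i], labels[i]) for i in train_idx]
    test = [(samples[i], labels[i]) for i in test_idx]

    # Group training examples by class, then oversample each class
    # (sampling with replacement) up to the majority-class count.
    by_class = defaultdict(list)
    for pair in train:
        by_class[pair[1]].append(pair)
    target = max(len(v) for v in by_class.values())
    balanced = []
    for items in by_class.values():
        balanced.extend(items)
        balanced.extend(rng.choices(items, k=target - len(items)))
    rng.shuffle(balanced)
    return balanced, test
```

In practice one would use stratified splitting and a library oversampler, but the ordering constraint (split before balance) is the point: balancing the full dataset before splitting would leak duplicated minority examples into the test set and inflate reported metrics.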