Mental manipulation on social media poses a covert yet serious threat to individuals' psychological well-being and the integrity of online interactions. Detecting such behavior is challenging because training data is difficult to annotate, the manipulation is highly covert and unfolds over multiple dialogue turns, and real-world datasets are scarce. To address these challenges, we propose MentalMAD, a framework that enhances large language models for mental manipulation detection. Our approach consists of three key components: EvoSA, an annotation-free data augmentation method that combines evolutionary operations with speech-act-aware prompting; complementary-task supervision generated by a teacher model; and Complementary-Convergent Distillation, a phase-wise strategy for transferring manipulation-specific knowledge to student models. We further construct ReaMent, a dataset of 5,000 dialogues sourced from real-world interactions. Extensive experiments show that MentalMAD improves accuracy by 14.0%, macro-F1 by 27.3%, and weighted F1 by 15.1% over the strongest baseline. The code and dataset are publicly available at https://github.com/Yuansheng-Gao/MentalMAD.