Leveraging Large Language Models for Suicide Detection on Social Media with Limited Labels

The increasing frequency of suicidal thoughts highlights the importance of early detection and intervention. Social media platforms, where users often share personal experiences and seek help, could be used to identify individuals at risk. However, the large volume of daily posts makes manual review impractical. This paper explores the use of Large Language Models (LLMs) to automatically detect suicidal content in text-based social media posts. We propose a novel method for generating pseudo-labels for unlabeled data by prompting LLMs, combined with traditional classification fine-tuning techniques to enhance label accuracy. To create a strong suicide detection model, we develop an ensemble approach that combines prompting with Qwen2-72B-Instruct and fine-tuned models such as Llama3-8B, Llama3.1-8B, and Gemma2-9B. We evaluate our approach on the dataset of the Suicide Ideation Detection on Social Media Challenge, a track of the IEEE Big Data 2024 Big Data Cup. Additionally, we conduct a comprehensive analysis to assess the impact of different models and fine-tuning strategies on detection performance. Experimental results show that the ensemble model significantly improves detection accuracy, by 5 percentage points compared with the individual models. It achieves a weighted F1 score of 0.770 on the public test set and 0.731 on the private test set, providing a promising solution for identifying suicidal content in social media. Our analysis shows that the choice of LLM affects prompting performance, with larger models providing better accuracy. Our code and checkpoints are publicly available at https://github.com/khanhvynguyen/Suicide_Detection_LLMs.
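To make the two quantities in the abstract concrete, the following is a minimal sketch of combining per-model predictions and scoring them with a weighted F1. It is not the authors' implementation: the abstract does not say how the ensemble combines its members, so majority voting (with ties going to the earliest-listed model) is an assumption, and the function names `majority_vote` and `weighted_f1` as well as the labels `"a"` and `"b"` are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists (all the same length) by majority vote.

    Assumption: ties are broken in favour of the earliest-listed model
    whose vote is among the most frequent labels.
    """
    ensembled = []
    for votes in zip(*predictions):
        counts = Counter(votes)
        top = max(counts.values())
        winner = next(v for v in votes if counts[v] == top)
        ensembled.append(winner)
    return ensembled


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights proportional to class support."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += f1 * n / total
    return score


if __name__ == "__main__":
    # Hypothetical predictions from three models on three posts.
    model_outputs = [
        ["a", "b", "a"],
        ["a", "a", "b"],
        ["b", "b", "b"],
    ]
    combined = majority_vote(model_outputs)
    print(combined)
    print(weighted_f1(["a", "b", "b"], combined))
```

In a pseudo-labeling setup like the one the abstract describes, the combined labels for unlabeled posts could serve as training targets for further fine-tuning; how the authors filter or weight those labels is not stated in the abstract.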

