Hierarchical Dual-Head Model for Suicide Risk Assessment via MentalRoBERTa

Social media platforms have become important sources for identifying suicide risk, but automated detection systems face multiple challenges including severe class imbalance, temporal complexity in posting patterns, and the dual nature of risk levels as both ordinal and categorical. This paper proposes a hierarchical dual-head neural network based on MentalRoBERTa for suicide risk classification into four levels: indicator, ideation, behavior, and attempt. The model employs two complementary prediction heads operating on a shared sequence representation: a CORAL (Consistent Rank Logits) head that preserves ordinal relationships between risk levels, and a standard classification head that enables flexible categorical distinctions. A 3-layer Transformer encoder with 8-head multi-head attention models temporal dependencies across post sequences, while explicit time interval embeddings capture posting behavior dynamics. The model is trained with a combined loss function (0.5 CORAL + 0.3 Cross-Entropy + 0.2 Focal Loss) that simultaneously addresses ordinal structure preservation, overconfidence reduction, and class imbalance. To improve computational efficiency, we freeze the first 6 layers (50%) of MentalRoBERTa and employ mixed-precision training. The model is evaluated using 5-fold stratified cross-validation with macro F1 score as the primary metric.
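The combined objective described above can be illustrated for a single example with a minimal, standard-library-only sketch. This is not the authors' implementation: the function names, the focal-loss focusing parameter `gamma=2.0`, and the choice to sum the CORAL terms over the three rank thresholds are assumptions made for illustration, since the abstract gives only the mixing weights 0.5, 0.3, and 0.2. The CORAL head is modelled here as one shared scalar score plus K-1 threshold biases, which is the standard CORAL formulation.

```python
import math

# Ordinal labels assumed for illustration: 0=indicator, 1=ideation, 2=behavior, 3=attempt
NUM_CLASSES = 4


def log_sigmoid(z):
    """Numerically stable log(sigmoid(z))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def softmax(logits):
    """Softmax with max-subtraction for stability."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def coral_logits(score, biases):
    """CORAL head: one shared score plus K-1 threshold-specific biases."""
    return [score + b for b in biases]


def coral_loss(rank_logits, label):
    """Sum of binary cross-entropies over the K-1 tasks 'is label > k?'."""
    loss = 0.0
    for k, z in enumerate(rank_logits):
        if label > k:
            loss -= log_sigmoid(z)    # target 1: -log(sigmoid(z))
        else:
            loss -= log_sigmoid(-z)   # target 0: -log(1 - sigmoid(z))
    return loss


def cross_entropy(class_logits, label):
    """Standard softmax cross-entropy for the classification head."""
    return -math.log(softmax(class_logits)[label])


def focal_loss(class_logits, label, gamma=2.0):
    """Focal loss; gamma=2.0 is an assumed value, not taken from the paper."""
    p_t = softmax(class_logits)[label]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def combined_loss(rank_logits, class_logits, label, weights=(0.5, 0.3, 0.2)):
    """Weighted sum 0.5 * CORAL + 0.3 * CE + 0.2 * Focal, as stated in the abstract."""
    w_coral, w_ce, w_focal = weights
    return (w_coral * coral_loss(rank_logits, label)
            + w_ce * cross_entropy(class_logits, label)
            + w_focal * focal_loss(class_logits, label))


def coral_predict(rank_logits):
    """Predicted rank = number of thresholds whose probability exceeds 0.5."""
    return sum(1 for z in rank_logits if z > 0.0)


# Usage with made-up numbers: a shared score of 1.0 and three threshold biases.
rank_logits = coral_logits(1.0, [1.0, 0.0, -2.0])   # [2.0, 1.0, -1.0]
class_logits = [0.1, 0.3, 1.2, -0.5]
print(coral_predict(rank_logits))                   # 2, i.e. "behavior"
print(combined_loss(rank_logits, class_logits, label=2))
```

Because the CORAL term penalizes each crossed threshold separately, a prediction two levels away from the true label costs more than one that is a single level off, while the cross-entropy and focal terms treat all wrong classes alike. This is one way to read the abstract's claim that the two heads are complementary.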

