Classifying mental health conditions is challenging for several reasons. The conditions overlap with one another, and their signs depend on context. Although fine-tuning transformers has improved mental health classification, standard cross-entropy training tends to produce entangled feature spaces and fails to use all the information the transformer contains. We present a representation-focused framework for mental health classification built on two components. First, \textbf{layer-attentive residual aggregation} operates on residual connections to weight and fuse representations from all transformer layers while preserving high-level semantics. Second, \textbf{supervised contrastive feature learning} applies temperature-scaled supervised contrastive learning with progressive weighting to restructure the feature space, widening the geometric margin between confusable mental health conditions and reducing class overlap. With a score of \textbf{74.36\%}, the proposed method performs best on the SWMH benchmark, outperforming domain-specialized models such as \textit{MentalBERT} and \textit{MentalRoBERTa} by margins between 2.2\% and 3.25\% and exceeding the highest-achieving model by 2.41 recall points. These findings show that carefully designed representation geometry and layer-aware residual integration can surpass domain-adaptive pretraining for mental health text classification, while also improving interpretability through learned layer importance.
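The two components named above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract gives no equations, so the softmax parameterization of layer weights, the residual placement on the final layer, the linear ramp used for progressive weighting, and all function and parameter names (`layer_attentive_residual`, `supcon_loss`, `progressive_weight`, `max_weight`) are assumptions made for illustration. The contrastive term follows the commonly used temperature-scaled supervised contrastive formulation, in which each anchor is pulled toward same-label samples and pushed away from all others.

```python
import numpy as np


def layer_attentive_residual(layer_states, layer_logits):
    """Fuse per-layer representations with softmax weights (assumed form).

    layer_states: array of shape (L, B, D), one pooled vector per layer.
    layer_logits: array of shape (L,), hypothetical learnable layer scores.
    Returns the fused (B, D) representation and the (L,) layer weights.
    """
    w = np.exp(layer_logits - layer_logits.max())
    w = w / w.sum()                                      # softmax over layers
    fused = np.tensordot(w, layer_states, axes=(0, 0))   # weighted sum, (B, D)
    # Residual connection: keep the final layer's high-level semantics.
    return layer_states[-1] + fused, w


def supcon_loss(features, labels, temperature=0.1):
    """Temperature-scaled supervised contrastive loss over one batch.

    features: (N, D) embeddings; labels: (N,) integer class ids.
    Assumes N >= 2 and at least one class with two or more samples.
    """
    labels = np.asarray(labels)
    z = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    self_mask = np.eye(len(labels), dtype=bool)
    sim = np.where(self_mask, -np.inf, sim)              # exclude self-pairs
    row_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - row_max - np.log(
        np.exp(sim - row_max).sum(axis=1, keepdims=True)
    )                                                    # log-softmax over others
    pos = (labels[:, None] == labels[None, :]) & ~self_mask
    n_pos = pos.sum(axis=1)
    valid = n_pos > 0                                    # anchors with a positive
    per_anchor = -np.where(pos, log_prob, 0.0).sum(axis=1)[valid] / n_pos[valid]
    return per_anchor.mean()


def progressive_weight(step, total_steps, max_weight=0.5):
    """Linear ramp from 0 to max_weight (assumed schedule, not from the paper)."""
    return max_weight * min(1.0, step / max(1, total_steps))
```

Under these assumptions, a training step would combine the objectives as `ce_loss + progressive_weight(step, total_steps) * supcon_loss(features, labels)`, and the returned layer weights could be inspected as a rough indication of learned layer importance.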