This paper describes our participation in the MentalRiskES task at IberLEF 2023. The task involved predicting the likelihood that an individual is experiencing depression based on their social media activity. The dataset consisted of conversations from 175 Telegram users, each labeled according to the evidence that they suffer from the disorder. We used a combination of traditional machine learning and deep learning techniques to solve four predictive subtasks: binary classification, simple regression, multiclass classification, and multiclass regression. Our approach was to train a single model on the multiclass regression subtask and then transform its predictions into outputs for the other three subtasks. We compare the performance of two modeling approaches: fine-tuning a BERT-based model, and using sentence embeddings as inputs to a linear regressor, with the latter yielding better results. The code to reproduce our results can be found at: https://github.com/simonsanvil/EarlyDepression-MentalRiskES.
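The step of reusing multiclass regression predictions for the other three subtasks can be illustrated with a minimal sketch. It is not taken from the authors' repository: the class names, the position of the control class, the 0.5 threshold, and the function name `derive_subtask_outputs` are all assumptions made for illustration. The sketch assumes that the model outputs one score per class for each user and that one of those classes represents users with no evidence of the disorder.

```python
import numpy as np

# Hypothetical class order; the abstract does not name the classes.
CLASSES = ["control", "suffer_a", "suffer_b", "suffer_c"]


def derive_subtask_outputs(scores, control_index=0, threshold=0.5):
    """Map per-class scores to outputs for all four subtasks.

    scores: array-like of shape (n_users, n_classes), e.g. raw outputs of
    a linear regressor, which may be negative or may not sum to 1.
    """
    probs = np.asarray(scores, dtype=float)

    # Clip negative scores and renormalise each row so it sums to 1.
    probs = np.clip(probs, 0.0, None)
    totals = probs.sum(axis=1, keepdims=True)
    safe_totals = np.where(totals > 0, totals, 1.0)
    # Rows that are all zero fall back to a uniform distribution.
    probs = np.where(totals > 0, probs / safe_totals, 1.0 / probs.shape[1])

    # Assumed mapping: overall risk is the mass outside the control class.
    risk = 1.0 - probs[:, control_index]

    return {
        "multiclass_regression": probs,
        "multiclass_classification": probs.argmax(axis=1),
        "simple_regression": risk,
        "binary_classification": (risk >= threshold).astype(int),
    }


if __name__ == "__main__":
    example = [[0.7, 0.1, 0.1, 0.1], [0.2, 0.5, 0.2, 0.1]]
    outputs = derive_subtask_outputs(example)
    for name, values in outputs.items():
        print(name, values.tolist())
```

Under these assumptions only one model has to be trained, and the four sets of predictions stay mutually consistent because they are all derived from the same per-class scores.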