We describe the development of a model to detect user-level clinical depression based on a user's temporal social media posts. Our model uses a Depression Symptoms Detection (DSD) classifier, which is trained on the largest existing sample of clinician-annotated tweets for clinical depression symptoms. We then use our DSD model to extract clinically relevant features, e.g., depression scores and their consequent temporal patterns, as well as user posting activity patterns, e.g., quantifying a user's ``no activity'' or ``silence.'' To evaluate the efficacy of these extracted features, we create three kinds of datasets, including a test dataset, from two existing well-known benchmark datasets for user-level depression detection. We then report accuracy measures based on single features, baseline features, and feature ablation tests at several levels of temporal granularity. The relevant data distributions and clinical depression detection settings can be used to draw a complete picture of the impact of different features across our created datasets. Finally, we show that, in general, only semantic-oriented representation models perform well. However, clinical features may enhance overall performance provided that the training and testing distributions are similar and there is more data in a user's timeline. As a consequence, the predictive capability of depression scores increases significantly when they are used in a more sensitive clinical depression detection setting.