Adaptive mobile device-based health interventions often use machine learning models trained on non-mobile device data, such as social media text, because large text message (SMS) datasets are difficult and expensive to collect. Understanding how these platforms differ, and how well models generalize between them, is therefore crucial for proper deployment. We examined the psycho-linguistic differences between Facebook and text messages, and their impact on out-of-domain model performance, using a sample of 120 users who shared both. We found that users use Facebook for sharing experiences (e.g., leisure) and SMS for task-oriented and conversational purposes (e.g., plan confirmations), reflecting differences in the platforms' affordances. To examine the downstream effects of these differences, we used pre-trained Facebook-based language models to estimate age, gender, depression, life satisfaction, and stress on both Facebook and SMS. For 6 of 8 models, we found no significant difference between platforms in the correlations between the estimates and self-reports. These results suggest that pre-trained Facebook language models can be applied to text messages to improve the accuracy of just-in-time interventions.
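The platform comparison described above amounts to asking whether a model's estimates correlate with self-reports equally well on Facebook and on SMS for the same users. The following is a minimal sketch of one way such a comparison could be run, using a paired bootstrap over users. It is not the authors' procedure, which the abstract does not specify; the function names, the synthetic data, and the choice of a percentile bootstrap are all assumptions made for illustration.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bootstrap_corr_difference(self_report, est_facebook, est_sms,
                              n_boot=1000, alpha=0.05, seed=0):
    """Paired bootstrap for r(Facebook, self-report) - r(SMS, self-report).

    Users are resampled with replacement, and the same resampled users are
    used for both platforms, because both estimates come from the same people.
    Returns the observed difference and an approximate percentile interval.
    """
    n = len(self_report)
    rng = random.Random(seed)
    observed = (pearson_r(est_facebook, self_report)
                - pearson_r(est_sms, self_report))
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        y = [self_report[i] for i in idx]
        fb = [est_facebook[i] for i in idx]
        sms = [est_sms[i] for i in idx]
        diffs.append(pearson_r(fb, y) - pearson_r(sms, y))
    diffs.sort()
    lo = diffs[int((alpha / 2) * n_boot)]
    hi = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return observed, (lo, hi)


# Synthetic stand-in data (hypothetical): 120 users, with both platform
# estimates equally noisy around the same self-reported score.
demo_rng = random.Random(42)
truth = [demo_rng.gauss(0, 1) for _ in range(120)]
fb_est = [t + demo_rng.gauss(0, 1) for t in truth]
sms_est = [t + demo_rng.gauss(0, 1) for t in truth]

observed, (lo, hi) = bootstrap_corr_difference(truth, fb_est, sms_est)
print(f"r_fb - r_sms = {observed:.3f}, 95% interval [{lo:.3f}, {hi:.3f}]")
```

An interval that contains zero is consistent with no difference between platforms at the chosen level, while an interval that excludes zero would indicate that the model performs differently on the two text sources.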