Background: Studies have linked long-term negative emotions and chronic stress to adverse health effects ranging from headaches to cardiovascular disease. Because many indicators of stress are imperceptible to observers, early detection of stress, and early intervention, remain a pressing medical need. Physiological signals offer a non-invasive way to monitor emotions and are easily collected by smartwatches. Existing research focuses primarily on developing generalized machine learning models for emotion classification.

Objective: We aim to study the differences between personalized and generalized machine learning models for three-class emotion classification (neutral, stress, and amusement) using wearable biosignal data.

Methods: We developed a convolutional encoder for the three-class emotion classification problem using data from WESAD, a multimodal dataset with physiological signals for 15 subjects. We compared three models: a subject-exclusive generalized model, a subject-inclusive generalized model, and a personalized model.

Results: For the three-class classification problem, our personalized model achieved an average accuracy of 95.06% and an F1-score of 91.71, our subject-inclusive generalized model achieved an average accuracy of 66.95% and an F1-score of 42.50, and our subject-exclusive generalized model achieved an average accuracy of 67.65% and an F1-score of 43.05.

Conclusions: Our results emphasize the need for more research on personalized emotion recognition models, given that they outperform generalized models in certain contexts. We also demonstrate that personalized machine learning models for emotion classification are viable and can achieve high performance.
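The abstract names three evaluation setups without spelling out how the data are partitioned. The following is a minimal sketch of one plausible reading, not the authors' procedure: a subject-exclusive model never sees the target subject during training, a subject-inclusive model trains on all subjects including part of the target's data, and a personalized model trains only on the target's own data. The helper `make_splits`, its parameters, the random within-subject holdout, and the synthetic subject IDs are all assumptions made for illustration.

```python
import numpy as np


def make_splits(subject_ids, scheme, target, test_frac=0.2, seed=0):
    """Return (train_idx, test_idx) for one target subject.

    Hypothetical helper: the scheme names follow the abstract, but the exact
    partitioning rules below are assumptions, not the paper's protocol.
    """
    subject_ids = np.asarray(subject_ids)
    all_idx = np.arange(len(subject_ids))
    is_target = subject_ids == target

    if scheme == "subject_exclusive":
        # Train on every other subject, test on all of the target's windows.
        return all_idx[~is_target], all_idx[is_target]

    # The other two schemes hold out a random share of the target's windows.
    # (Assumption: a random holdout; overlapping windows could leak in practice.)
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(all_idx[is_target])
    n_test = max(1, int(round(test_frac * len(shuffled))))
    test_idx = np.sort(shuffled[:n_test])
    target_train = np.sort(shuffled[n_test:])

    if scheme == "personalized":
        # Train only on the target subject's remaining windows.
        return target_train, test_idx
    if scheme == "subject_inclusive":
        # Train on all other subjects plus the target's remaining windows.
        train_idx = np.sort(np.concatenate([all_idx[~is_target], target_train]))
        return train_idx, test_idx
    raise ValueError(f"unknown scheme: {scheme!r}")


# Synthetic example: 3 subjects with 10 windows each (not WESAD data).
demo_subjects = np.repeat([1, 2, 3], 10)
for name in ("subject_exclusive", "subject_inclusive", "personalized"):
    train, test = make_splits(demo_subjects, name, target=2)
    print(name, len(train), len(test))
```

Under this reading, the subject-inclusive and personalized models are scored on the same held-out windows, so any difference between them comes from the training set alone.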