Research into the detection of human activities from wearable sensors is a highly active field, benefiting numerous applications, from ambulatory monitoring of healthcare patients through fitness coaching to streamlining manual work processes. We present an empirical study that compares four commonly used annotation methods in user studies that focus on in-the-wild data. These methods can be grouped into user-driven, in situ annotations, which are performed before or while the activity is recorded, and recall methods, where participants annotate their data in hindsight at the end of the day. Our study illustrates that different labeling methodologies directly impact the quality of the annotations, as well as the capabilities of a deep learning classifier trained on the respective data. We observed that in situ methods produce fewer but more precise labels than recall methods. Furthermore, we combined an activity diary with a visualization tool that enables participants to inspect and label their activity data. By introducing such a tool, we were able to decrease missing annotations and increase annotation consistency, and therefore the F1-score of the deep learning model, by up to 8% (ranging between 82.1% and 90.4% F1-score). Finally, we discuss the advantages and disadvantages of the methods compared in our study, the biases they may introduce, the consequences of their usage for human activity recognition studies, and possible solutions.