Intensive care units (ICUs) are complex and data-rich environments. Data routinely collected in ICUs provide tremendous opportunities for machine learning, but their use comes with significant challenges. Complex problems may require additional input from humans, which can be provided through data annotation. Annotation is a complex, time-consuming process that requires domain expertise and technical proficiency. Existing data annotation tools fail to provide an effective solution to this problem. In this study, we investigated how clinicians approach the annotation task. We focused on establishing the characteristics of the annotation process in the context of clinical data and on identifying differences in the annotation workflow between staff roles. The overall goal was to elicit requirements for a software tool that could facilitate effective and time-efficient data annotation. We conducted an experiment in which ICU clinicians annotated printed sheets of data. The participants were observed during the task, and their actions were analysed in the context of Norman's Interaction Cycle to establish the requirements for the digital tool. The annotation process followed a continuous loop of annotation and evaluation, during which participants incrementally analysed and annotated the data. No distinguishable differences were identified in how different staff roles annotate data. Preferred methods for applying annotations varied between participants and between admissions. In conclusion, by observing a manual data annotation activity, we characterised the clinicians' approach to annotation and elicited 11 key requirements for effective digital data annotation software in the healthcare setting.