This paper explores automated face and facial landmark detection in neonates, an important first step in many video-based neonatal health applications, such as vital sign estimation, pain assessment, sleep-wake classification, and jaundice detection. Utilising three publicly available datasets of neonates in the clinical environment, 366 images (258 subjects) and 89 images (66 subjects) were annotated for training and testing, respectively. Transfer learning was applied to two YOLO-based models, with input training images augmented with random horizontal flipping, photometric colour distortion, translation and scaling during each training epoch. Additionally, the re-orientation of input images and the fusion of trained deep learning models were explored. Our proposed model based on YOLOv7Face outperformed existing methods, with a mean average precision of 84.8% for face detection and a normalised mean error of 0.072 for facial landmark detection. Overall, this work will assist in the development of fully automated neonatal health assessment algorithms.
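For readers unfamiliar with the landmark metric, the following is a minimal sketch of how a normalised mean error is commonly computed: the mean Euclidean distance between predicted and annotated landmarks, divided by a normalising length. The abstract does not state which normaliser was used (inter-ocular distance and face bounding-box size are both common choices), so the `norm` argument, the function name, and the example coordinates below are all illustrative assumptions rather than the authors' implementation.

```python
import math


def normalised_mean_error(pred, truth, norm):
    """Mean Euclidean landmark error divided by a normalising length.

    pred, truth: sequences of (x, y) landmark coordinates in pixels.
    norm: normalising length in pixels (ASSUMPTION: the paper's choice of
          normaliser is not given in the abstract).
    """
    if len(pred) != len(truth) or len(pred) == 0:
        raise ValueError("pred and truth must be non-empty and the same length")
    if norm <= 0:
        raise ValueError("norm must be positive")
    # Sum the per-landmark Euclidean distances, then average and normalise.
    total = sum(math.dist(p, t) for p, t in zip(pred, truth))
    return total / (len(pred) * norm)


# Hypothetical five-point annotation (eyes, nose, mouth corners).
truth = [(30.0, 40.0), (70.0, 40.0), (50.0, 60.0), (35.0, 80.0), (65.0, 80.0)]
# Hypothetical prediction: every landmark is off by (3, 4), i.e. 5 pixels.
pred = [(x + 3.0, y + 4.0) for x, y in truth]

nme = normalised_mean_error(pred, truth, norm=50.0)
print(f"NME = {nme:.3f}")
```

Because the error is divided by a length measured on the face itself, the metric is comparable across images in which the neonate appears at different scales.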