Respiration is a critical vital sign for infants, and continuous respiratory monitoring is particularly important for newborns. However, neonates are sensitive, and contact-based sensors present challenges in comfort, hygiene, and skin health, especially for preterm babies. As a step toward fully automatic, continuous, and contactless respiratory monitoring, we develop a deep-learning method for estimating respiratory rate and waveform from plain video footage in natural settings. Our automated infant respiration flow-based network (AIRFlowNet) combines video-extracted optical flow input and spatiotemporal convolutional processing tuned to the infant domain. We support our model with the first public annotated infant respiration dataset with 125 videos (AIR-125), drawn from eight infant subjects and set in varied pose, lighting, and camera conditions. We include manual respiration annotations and optimize AIRFlowNet training on them using a novel spectral bandpass loss function. When trained and tested on the AIR-125 infant data, our method significantly outperforms other state-of-the-art methods in respiratory rate estimation, achieving a mean absolute error of $\sim$2.9 breaths per minute, compared to $\sim$4.7--6.2 for other public models designed for adult subjects and more uniform environments.
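To make the reported metric concrete, the following is a minimal sketch, not the paper's implementation, of how a respiratory rate in breaths per minute might be read from an estimated waveform and then scored by mean absolute error. The function names, the 10 to 90 breaths-per-minute search band, and the 0.5 breaths-per-minute search step are all assumptions chosen for illustration; the abstract does not specify how AIRFlowNet or its spectral bandpass loss handle these details.

```python
import math


def respiratory_rate_bpm(waveform, fps, low_bpm=10.0, high_bpm=90.0, step_bpm=0.5):
    """Return the dominant frequency of `waveform`, in breaths per minute.

    The search is restricted to [low_bpm, high_bpm]. The band limits and the
    step size are illustrative assumptions, not values from the paper.
    """
    n = len(waveform)
    mean = sum(waveform) / n
    centered = [x - mean for x in waveform]  # remove the DC offset

    best_bpm, best_power = None, -1.0
    steps = int(round((high_bpm - low_bpm) / step_bpm))
    for k in range(steps + 1):
        bpm = low_bpm + k * step_bpm
        freq_hz = bpm / 60.0
        # Project the signal onto a cosine and a sine at this frequency.
        re = 0.0
        im = 0.0
        for i, x in enumerate(centered):
            angle = 2.0 * math.pi * freq_hz * i / fps
            re += x * math.cos(angle)
            im += x * math.sin(angle)
        power = re * re + im * im
        if power > best_power:
            best_bpm, best_power = bpm, power
    return best_bpm


def mean_absolute_error(predicted, reference):
    """Mean absolute difference between predicted and reference rates."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)
```

In use, one would call `respiratory_rate_bpm` on each video's predicted waveform (with that video's frame rate) and on the corresponding annotation, then pass the two lists of rates to `mean_absolute_error`. Restricting the search to a plausible band keeps slow body motion and high-frequency noise from being mistaken for breathing.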