Non-contact sensing is an emerging technology with applications across many industries, from driver monitoring in vehicles to patient monitoring in healthcare. Current state-of-the-art implementations focus on RGB video, which struggles in varying or noisy light conditions and is almost completely infeasible in the dark. Near Infra-Red (NIR) video, however, does not suffer from these constraints. This paper aims to demonstrate the effectiveness of an alternative Convolution Attention Network (CAN) architecture for regressing a photoplethysmography (PPG) signal from a sequence of NIR frames. A combination of two publicly available datasets, split into training and test sets, is used to train the CAN. The combined dataset is augmented so that each subject is represented by videos spanning the full range of heart rates, which reduces overfitting to the 'normal' 60 to 80 bpm range. When applied to video cropped to the subject's head, the CAN achieved a Mean Absolute Error (MAE) of just 0.99 bpm, demonstrating its effectiveness on NIR video and the feasibility of the architecture for regressing an accurate signal output.
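To make the reported metric concrete, the following is a minimal sketch of how a heart rate in bpm might be read off a regressed PPG trace and scored with MAE. It is not the evaluation pipeline used in this paper: the spectral-peak estimator, the 0.7 to 4.0 Hz search band, the 30 fps sampling rate, and the function names `estimate_bpm` and `mean_absolute_error` are all assumptions chosen for illustration.

```python
import numpy as np


def estimate_bpm(ppg, fs, lo_hz=0.7, hi_hz=4.0):
    """Estimate heart rate (bpm) as the strongest spectral peak of a PPG trace.

    The search band (lo_hz, hi_hz) is an assumed plausible heart-rate range,
    not a value taken from the paper.
    """
    ppg = np.asarray(ppg, dtype=float)
    ppg = ppg - ppg.mean()                      # remove the DC offset
    power = np.abs(np.fft.rfft(ppg)) ** 2       # one-sided power spectrum
    freqs = np.fft.rfftfreq(ppg.size, d=1.0 / fs)
    band = (freqs >= lo_hz) & (freqs <= hi_hz)  # keep only plausible pulse rates
    peak_hz = freqs[band][np.argmax(power[band])]
    return 60.0 * peak_hz                       # Hz -> beats per minute


def mean_absolute_error(pred_bpm, true_bpm):
    """Mean absolute error between predicted and reference heart rates."""
    pred = np.asarray(pred_bpm, dtype=float)
    true = np.asarray(true_bpm, dtype=float)
    return float(np.mean(np.abs(pred - true)))


# Synthetic stand-ins for regressed PPG signals: 20 s at an assumed 30 fps.
fs = 30.0
t = np.arange(600) / fs
predicted = [
    estimate_bpm(np.sin(2 * np.pi * 1.2 * t), fs),
    estimate_bpm(np.sin(2 * np.pi * 1.5 * t), fs),
]
reference = [70.0, 91.0]  # hypothetical ground-truth heart rates in bpm
print(mean_absolute_error(predicted, reference))
```

With a 20-second window the spectral resolution is 0.05 Hz (3 bpm), so a practical evaluation would likely use longer windows, zero-padding, or peak interpolation to resolve sub-bpm errors.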