In recent years, deep learning methods have shown impressive results for camera-based remote physiological signal estimation, clearly surpassing traditional methods. However, the performance and generalization ability of deep neural networks depend heavily on rich training data that faithfully represent the factors of variation encountered in real applications. Unfortunately, many current remote photoplethysmography (rPPG) datasets lack diversity, particularly in darker skin tones, which biases the performance of existing rPPG approaches. To mitigate this bias, we introduce PhysFlow, a novel method for augmenting skin diversity in remote heart rate estimation using conditional normalizing flows. PhysFlow is trained end to end, so that supervised rPPG approaches can be trained simultaneously on both original and generated data. Additionally, we condition our model on CIELAB color space skin features extracted directly from the facial videos, without the need for skin-tone labels. We validate PhysFlow on two publicly available datasets, UCLA-rPPG and MMPD, demonstrating reduced heart rate error, particularly for dark skin tones. Furthermore, we demonstrate its versatility and adaptability across different data-driven rPPG methods.
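The abstract does not say how the CIELAB skin features are computed, so the following is only a minimal sketch of one plausible reading: convert sRGB frames to CIELAB (D65 white point) and average L*, a*, b* over skin pixels. The function names `srgb_to_lab` and `skin_lab_features`, the input layout, and the assumption that a skin mask is already available are all hypothetical and are not taken from PhysFlow.

```python
import numpy as np

# Standard sRGB (D65) constants; everything else here is an illustrative assumption.
D65_WHITE = np.array([0.95047, 1.0, 1.08883])
SRGB_TO_XYZ = np.array([
    [0.4124564, 0.3575761, 0.1804375],
    [0.2126729, 0.7151522, 0.0721750],
    [0.0193339, 0.1191920, 0.9503041],
])


def srgb_to_lab(rgb):
    """Convert sRGB values in [0, 1] (last axis = R, G, B) to CIELAB."""
    rgb = np.asarray(rgb, dtype=np.float64)
    # Undo the sRGB gamma encoding to obtain linear RGB.
    linear = np.where(rgb <= 0.04045, rgb / 12.92, ((rgb + 0.055) / 1.055) ** 2.4)
    # Linear RGB -> XYZ, normalized by the D65 reference white.
    xyz = linear @ SRGB_TO_XYZ.T / D65_WHITE
    # CIELAB nonlinearity with its linear segment near zero.
    delta = 6.0 / 29.0
    f = np.where(xyz > delta ** 3, np.cbrt(xyz), xyz / (3 * delta ** 2) + 4.0 / 29.0)
    L = 116.0 * f[..., 1] - 16.0
    a = 500.0 * (f[..., 0] - f[..., 1])
    b = 200.0 * (f[..., 1] - f[..., 2])
    return np.stack([L, a, b], axis=-1)


def skin_lab_features(video, skin_mask):
    """Hypothetical conditioning vector: mean (L*, a*, b*) over skin pixels.

    video:     (T, H, W, 3) sRGB frames with values in [0, 1]
    skin_mask: (H, W) boolean mask, assumed to come from a separate skin detector
    """
    lab = srgb_to_lab(video)          # (T, H, W, 3)
    skin = lab[:, skin_mask]          # (T, N_skin, 3)
    return skin.mean(axis=(0, 1))     # (3,)
```

Calling `skin_lab_features(video, mask)` yields one 3-vector per clip. A vector of this kind could serve as the conditioning input of a conditional normalizing flow, but the features and architecture actually used may differ from this sketch.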