This work introduces a new multispectral database and novel approaches for eyeblink detection in individual RGB and Near-Infrared (NIR) images. Our contributed dataset, mEBAL2 (multimodal Eye Blink and Attention Level estimation, Version 2), is the largest existing eyeblink database and offers an opportunity to improve data-driven multispectral approaches for blink detection and related applications (e.g., attention level estimation and presentation attack detection in face biometrics). mEBAL2 includes 21,100 image sequences from 180 different students (more than 2 million labeled images in total), captured while they carried out e-learning tasks of varying difficulty or took a real introductory HTML course on the edX MOOC platform. mEBAL2 uses multiple sensors: two NIR cameras and one RGB camera capture facial gestures during the tasks, and an Electroencephalogram (EEG) band records the user's cognitive activity and blinking events. Furthermore, this work proposes a Convolutional Neural Network architecture as a benchmark for blink detection on mEBAL2, with performance of up to 97%. Different training methodologies are implemented using the RGB spectrum, the NIR spectrum, and the combination of both to enhance the performance of existing eyeblink detectors. We demonstrate that combining NIR and RGB images during training improves the performance of RGB eyeblink detectors (i.e., detection based only on an RGB image). Finally, the generalization capacity of the proposed eyeblink detectors is validated in wilder and more challenging environments, such as the HUST-LEBW dataset, to show the usefulness of mEBAL2 for training a new generation of data-driven approaches for eyeblink detection.
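The abstract does not say how NIR and RGB images are combined during training. As one possible reading, the minimal sketch below pools both spectra into a single shuffled training set by reducing every image to one channel, so that a single-input CNN could consume either spectrum. The function names (`to_single_channel`, `build_mixed_training_set`), the channel-average grayscale conversion, the 32x32 crop size, and the random arrays standing in for eye crops are all assumptions made for illustration; none of them come from mEBAL2 or the proposed architecture.

```python
import numpy as np


def to_single_channel(img):
    """Return an HxW float32 image in [0, 1] from an HxW (NIR) or HxWx3 (RGB) uint8 array."""
    img = np.asarray(img, dtype=np.float32) / 255.0
    if img.ndim == 3:
        # Plain channel average: an assumed, deliberately simple grayscale conversion.
        img = img.mean(axis=2)
    return img


def build_mixed_training_set(rgb_images, rgb_labels, nir_images, nir_labels, seed=0):
    """Pool RGB and NIR eye crops into one shuffled training set (hypothetical helper).

    Returns (x, y, is_nir): x has shape (N, H, W), y has shape (N,) with
    1 = blink and 0 = no blink, and is_nir marks samples from the NIR cameras.
    """
    images = list(rgb_images) + list(nir_images)
    x = np.stack([to_single_channel(i) for i in images])
    y = np.concatenate([np.asarray(rgb_labels), np.asarray(nir_labels)]).astype(np.int64)
    is_nir = np.concatenate(
        [np.zeros(len(rgb_labels), dtype=bool), np.ones(len(nir_labels), dtype=bool)]
    )
    # Shuffle so that mini-batches mix both spectra.
    order = np.random.default_rng(seed).permutation(len(y))
    return x[order], y[order], is_nir[order]


# Synthetic stand-ins for eye crops (random pixels, not mEBAL2 data).
rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)
nir = rng.integers(0, 256, size=(6, 32, 32), dtype=np.uint8)
x, y, is_nir = build_mixed_training_set(rgb, [0, 1, 0, 1], nir, [1, 0, 1, 0, 1, 0])
print(x.shape, y.shape, int(is_nir.sum()))
```

Keeping the `is_nir` mask alongside the labels makes it straightforward to train on the pooled set while evaluating on the RGB subset alone, which is the setting the abstract describes for RGB eyeblink detectors.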