Traditional video-induced physiological datasets usually rely on whole-trial labels, which introduce temporal label noise into dynamic emotion recognition. We present FIRMED, a peak-centered multimodal dataset built on an immediate-recall annotation paradigm, with synchronized EEG, ECG, GSR, PPG, and facial recordings from 35 participants. FIRMED provides event-centered timestamps, emotion labels, and intensity annotations, and its annotation quality is supported by both subjective and physiological validation. Benchmark experiments show that models trained with FIRMED's peak-centered labels consistently outperform those trained with whole-trial labels, yielding an average gain of 3.8 percentage points across eight EEG-based classifiers, with further improvements under multimodal fusion. FIRMED thus offers a practical benchmark for temporally localized supervision in multimodal affective computing.