Transformer-based Self-supervised Multimodal Representation Learning for Wearable Emotion Recognition - 专知论文

会员服务 ·

0

情感识别 · 穿戴式 · SSL · 模态 · 多模 ·

2023 年 3 月 29 日

Transformer-based Self-supervised Multimodal Representation Learning for Wearable Emotion Recognition

翻译：基于Transformer的可穿戴情感识别自监督多模态表示学习

Yujin Wu,Mohamed Daoudi,Ali Amad

from arxiv, Accepted IEEE Transactions On Affective Computing

Recently, wearable emotion recognition based on peripheral physiological signals has drawn massive attention due to its less invasive nature and its applicability in real-life scenarios. However, how to effectively fuse multimodal data remains a challenging problem. Moreover, traditional fully-supervised based approaches suffer from overfitting given limited labeled data. To address the above issues, we propose a novel self-supervised learning (SSL) framework for wearable emotion recognition, where efficient multimodal fusion is realized with temporal convolution-based modality-specific encoders and a transformer-based shared encoder, capturing both intra-modal and inter-modal correlations. Extensive unlabeled data is automatically assigned labels by five signal transforms, and the proposed SSL model is pre-trained with signal transformation recognition as a pretext task, allowing the extraction of generalized multimodal representations for emotion-related downstream tasks. For evaluation, the proposed SSL model was first pre-trained on a large-scale self-collected physiological dataset and the resulting encoder was subsequently frozen or fine-tuned on three public supervised emotion recognition datasets. Ultimately, our SSL-based method achieved state-of-the-art results in various emotion classification tasks. Meanwhile, the proposed model proved to be more accurate and robust compared to fully-supervised methods on low data regimes.

翻译：近年来，基于外周生理信号的可穿戴情感识别因其低侵入性和在现实场景中的适用性而受到广泛关注。然而，如何有效融合多模态数据仍然是一个具有挑战性的问题。此外，传统的全监督方法在标注数据有限的情况下容易出现过度拟合。为解决上述问题，我们提出了一种用于可穿戴情感识别的新型自监督学习框架，该框架通过基于时间卷积的模态特定编码器和基于Transformer的共享编码器实现高效多模态融合，捕获模态内和模态间的相关性。通过五种信号变换自动为大量未标注数据分配标签，并以信号变换识别作为预文本任务对所提出的自监督学习模型进行预训练，从而提取用于情感相关下游任务的通用多模态表示。为进行评估，我们首先在一个大规模自采集生理数据集上预训练所提出的自监督学习模型，随后将得到的编码器在三个公开监督情感识别数据集上进行冻结或微调。最终，我们的自监督学习方法在各类情感分类任务中取得了最先进的性能。同时，在低数据量条件下，所提出的模型相比全监督方法表现出更高的准确性和鲁棒性。

0

相关内容

情感识别

计算机对从传感器采集来的信号进行分析和处理，从而得出对方（人）正处在的情感状态，这种行为叫做情感识别。

视频自监督学习综述

视频自监督学习综述

专知会员服务

53+阅读 · 2022年7月5日

【CVPR 2022】一种无需使用负样本的自监督学习方法，Self-Supervised Predictive Learning: A Negative-Free Method for Sound Source Localization in Visual Scenes

【CVPR 2022】一种无需使用负样本的自监督学习方法，Self-Supervised Predictive Learning: A Negative-Free Method for Sound Source Localization in Visual Scenes

专知会员服务

15+阅读 · 2022年3月12日

【CVPR 2022】【视频检索用多模态融合Transformer】Everything at Once -- Multi-modal Fusion Transformer for Video Retrieval

【CVPR 2022】【视频检索用多模态融合Transformer】Everything at Once -- Multi-modal Fusion Transformer for Video Retrieval

专知会员服务

29+阅读 · 2022年3月6日

【CVPR 2022】多模态视频字幕的端到端生成预训练，End-to-end Generative Pretraining for Multimodal Video Captioning

【CVPR 2022】多模态视频字幕的端到端生成预训练，End-to-end Generative Pretraining for Multimodal Video Captioning

专知会员服务

27+阅读 · 2022年3月3日

【NeurIPS 2021 】MST: 用于Transformer视觉表征的Masked自监督解读

【NeurIPS 2021 】MST: 用于Transformer视觉表征的Masked自监督解读

专知会员服务

42+阅读 · 2021年12月11日

自监督学习最新研究进展

自监督学习最新研究进展

专知会员服务

77+阅读 · 2021年3月24日

零样本文本分类，Zero-Shot Learning for Text Classification

零样本文本分类，Zero-Shot Learning for Text Classification

专知会员服务

97+阅读 · 2020年5月31日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

【Yoshua Bengio新论文】多任务自监督学习语音识别，MULTI-TASK SELF-SUPERVISED LEARNING FOR ROBUST SPEECH RECOGNITION

【Yoshua Bengio新论文】多任务自监督学习语音识别，MULTI-TASK SELF-SUPERVISED LEARNING FOR ROBUST SPEECH RECOGNITION

专知会员服务

39+阅读 · 2020年1月30日

【AAAI2020-Oral】自监督时空学习的视频完形程序，Video Cloze Procedure for Self-Supervised Spatio-Temporal Learning

【AAAI2020-Oral】自监督时空学习的视频完形程序，Video Cloze Procedure for Self-Supervised Spatio-Temporal Learning

专知会员服务

30+阅读 · 2020年1月2日

最新10篇对比学习推荐前沿工作

最新10篇对比学习推荐前沿工作

机器学习与推荐算法

2+阅读 · 2022年9月14日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知

133+阅读 · 2020年3月18日

解读自监督学习(Self-Supervised Learning)几篇相关paper

解读自监督学习(Self-Supervised Learning)几篇相关paper

CVer

25+阅读 · 2020年2月21日

局部学习的特征选择：Local-Learning-Based Feature Selection

局部学习的特征选择：Local-Learning-Based Feature Selection

我爱读PAMI

14+阅读 · 2019年9月20日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

笔记 | Deep active learning for named entity recognition

笔记 | Deep active learning for named entity recognition

黑龙江大学自然语言处理实验室

24+阅读 · 2018年5月27日

基于健康数据分析的半监督在线学习血糖预报建模算法研究

国家自然科学基金

2+阅读 · 2015年12月31日

结合图像块联合聚类加权和混合分类器的非对齐稀疏表示识别方法

国家自然科学基金

1+阅读 · 2015年12月31日

基于眼动追踪的个性化图像推荐研究

国家自然科学基金

5+阅读 · 2014年12月31日

Resveratrol联合MSCs移植对阿尔茨海默鼠的干预效果及Sirt1分子信号的介导作用

国家自然科学基金

0+阅读 · 2014年12月31日

基于多模态情感识别的人机交流氛围场建模方法

国家自然科学基金

5+阅读 · 2013年12月31日

面向大数据的机器学习理论与方法

国家自然科学基金

4+阅读 · 2013年12月31日

半监督半配对高维多表示数据的降维及拓展研究

国家自然科学基金

0+阅读 · 2011年12月31日

基于注意力的情感脑机接口研究与示范应用

国家自然科学基金

4+阅读 · 2010年12月31日

融合噪声处理及不完备模态测试的海洋平台结构损伤检测

国家自然科学基金

0+阅读 · 2009年12月31日

音视融合的韵律模式的个性化研究

国家自然科学基金

0+阅读 · 2008年12月31日

Efficient Multimodal Transformer with Dual-Level Feature Restoration for Robust Multimodal Sentiment Analysis

Arxiv

0+阅读 · 2023年5月22日

VATLM: Visual-Audio-Text Pre-Training with Unified Masked Prediction for Speech Representation Learning

Arxiv

0+阅读 · 2023年5月19日

Multimodal Prompting with Missing Modalities for Visual Recognition

Arxiv

11+阅读 · 2023年3月6日

A Survey on Masked Autoencoder for Self-supervised Learning in Vision and Beyond

Arxiv

10+阅读 · 2022年7月30日

Multimodal Learning with Transformers: A Survey

Arxiv

69+阅读 · 2022年6月13日

Hybrid Curriculum Learning for Emotion Recognition in Conversation

Arxiv

14+阅读 · 2021年12月22日

Graph Self-Supervised Learning: A Survey

Arxiv

15+阅读 · 2021年8月5日

SiT: Self-supervised vIsion Transformer

Arxiv

19+阅读 · 2021年4月8日

Self-correcting Q-Learning

Arxiv

11+阅读 · 2020年12月2日

Meta-Transfer Learning for Zero-Shot Super-Resolution

Meta-Transfer Learning for Zero-Shot Super-Resolution

Arxiv

43+阅读 · 2020年2月27日

VIP会员

文章信息

相关主题

最新内容

DARPA拟打造十万规模自主思考作战的AI智能体集群：“受控涌现式分布式人工智能”（DICE）项目

DARPA拟打造十万规模自主思考作战的AI智能体集群：“受控涌现式分布式人工智能”（DICE）项目

专知会员服务

4+阅读 · 7月17日

《边缘端实时无线感知赋能现场多机器人部署》200页

《边缘端实时无线感知赋能现场多机器人部署》200页

专知会员服务

5+阅读 · 7月17日

战力倍增器：自主武器系统与乌克兰及加沙冲突

战力倍增器：自主武器系统与乌克兰及加沙冲突

专知会员服务

4+阅读 · 7月17日

人工智能赋能战场情报：提速决策进程

人工智能赋能战场情报：提速决策进程

专知会员服务

2+阅读 · 7月17日

《拥抱新兴技术：面向未来军官的教育革新》

《拥抱新兴技术：面向未来军官的教育革新》

专知会员服务

5+阅读 · 7月17日

ACM MM 2026 | MAR-GRPO：稳定混合图像生成的强化学习训练

ACM MM 2026 | MAR-GRPO：稳定混合图像生成的强化学习训练

专知会员服务

2+阅读 · 7月17日

综述 | 大模型水印理论与部署：来源追踪、攻击鲁棒与可信治理

综述 | 大模型水印理论与部署：来源追踪、攻击鲁棒与可信治理

专知会员服务

3+阅读 · 7月17日

《火线上的后勤保障：对抗环境下的随机规划模型研究——俄乌场景案例分析》99页

《火线上的后勤保障：对抗环境下的随机规划模型研究——俄乌场景案例分析》99页

专知会员服务

11+阅读 · 7月16日

《无人地面战车（UGV）的崛起》报告

《无人地面战车（UGV）的崛起》报告

专知会员服务

7+阅读 · 7月16日

《无人机参数化与集群飞行创新项目的监控流程管理：模型、策略及自适应解决方案》

《无人机参数化与集群飞行创新项目的监控流程管理：模型、策略及自适应解决方案》

专知会员服务

6+阅读 · 7月16日

《美军开放式任务系统（OMS）定义与文档（D&D）——Java关键抽象层（CAL）接口生成规范》47页标准

《美军开放式任务系统（OMS）定义与文档（D&D）——Java关键抽象层（CAL）接口生成规范》47页标准

专知会员服务

13+阅读 · 7月16日

美陆军任务式指挥人工智能解决方案

美陆军任务式指挥人工智能解决方案

专知会员服务

13+阅读 · 7月16日

ICML 2026 | 理论级自动形式化：从孤立命题到统一形式化知识库

ICML 2026 | 理论级自动形式化：从孤立命题到统一形式化知识库

专知会员服务

9+阅读 · 7月16日

综述 | 现代智能体自我改进，从模型更新到脚手架演化

综述 | 现代智能体自我改进，从模型更新到脚手架演化

专知会员服务

16+阅读 · 7月16日

美国陆军宣布“项目融合-顶点6”：现代化进程的关键里程碑

美国陆军宣布“项目融合-顶点6”：现代化进程的关键里程碑

专知会员服务

13+阅读 · 7月15日

相关VIP内容

视频自监督学习综述

视频自监督学习综述

专知会员服务

53+阅读 · 2022年7月5日

【CVPR 2022】一种无需使用负样本的自监督学习方法，Self-Supervised Predictive Learning: A Negative-Free Method for Sound Source Localization in Visual Scenes

【CVPR 2022】一种无需使用负样本的自监督学习方法，Self-Supervised Predictive Learning: A Negative-Free Method for Sound Source Localization in Visual Scenes

专知会员服务

15+阅读 · 2022年3月12日

【CVPR 2022】【视频检索用多模态融合Transformer】Everything at Once -- Multi-modal Fusion Transformer for Video Retrieval

【CVPR 2022】【视频检索用多模态融合Transformer】Everything at Once -- Multi-modal Fusion Transformer for Video Retrieval

专知会员服务

29+阅读 · 2022年3月6日

【CVPR 2022】多模态视频字幕的端到端生成预训练，End-to-end Generative Pretraining for Multimodal Video Captioning

【CVPR 2022】多模态视频字幕的端到端生成预训练，End-to-end Generative Pretraining for Multimodal Video Captioning

专知会员服务

27+阅读 · 2022年3月3日

【NeurIPS 2021 】MST: 用于Transformer视觉表征的Masked自监督解读

【NeurIPS 2021 】MST: 用于Transformer视觉表征的Masked自监督解读

专知会员服务

42+阅读 · 2021年12月11日

自监督学习最新研究进展

自监督学习最新研究进展

专知会员服务

77+阅读 · 2021年3月24日

零样本文本分类，Zero-Shot Learning for Text Classification

零样本文本分类，Zero-Shot Learning for Text Classification

专知会员服务

97+阅读 · 2020年5月31日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

【Yoshua Bengio新论文】多任务自监督学习语音识别，MULTI-TASK SELF-SUPERVISED LEARNING FOR ROBUST SPEECH RECOGNITION

【Yoshua Bengio新论文】多任务自监督学习语音识别，MULTI-TASK SELF-SUPERVISED LEARNING FOR ROBUST SPEECH RECOGNITION

专知会员服务

39+阅读 · 2020年1月30日

【AAAI2020-Oral】自监督时空学习的视频完形程序，Video Cloze Procedure for Self-Supervised Spatio-Temporal Learning

【AAAI2020-Oral】自监督时空学习的视频完形程序，Video Cloze Procedure for Self-Supervised Spatio-Temporal Learning

专知会员服务

30+阅读 · 2020年1月2日

热门VIP内容

开通专知VIP会员享更多权益服务

《边缘端实时无线感知赋能现场多机器人部署》200页

人工智能赋能战场情报：提速决策进程

DARPA拟打造十万规模自主思考作战的AI智能体集群：“受控涌现式分布式人工智能”（DICE）项目

战力倍增器：自主武器系统与乌克兰及加沙冲突

相关资讯

最新10篇对比学习推荐前沿工作

最新10篇对比学习推荐前沿工作

机器学习与推荐算法

2+阅读 · 2022年9月14日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知

133+阅读 · 2020年3月18日

解读自监督学习(Self-Supervised Learning)几篇相关paper

解读自监督学习(Self-Supervised Learning)几篇相关paper

CVer

25+阅读 · 2020年2月21日

局部学习的特征选择：Local-Learning-Based Feature Selection

局部学习的特征选择：Local-Learning-Based Feature Selection

我爱读PAMI

14+阅读 · 2019年9月20日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

笔记 | Deep active learning for named entity recognition

笔记 | Deep active learning for named entity recognition

黑龙江大学自然语言处理实验室

24+阅读 · 2018年5月27日

相关论文

Efficient Multimodal Transformer with Dual-Level Feature Restoration for Robust Multimodal Sentiment Analysis

Arxiv

0+阅读 · 2023年5月22日

VATLM: Visual-Audio-Text Pre-Training with Unified Masked Prediction for Speech Representation Learning

Arxiv

0+阅读 · 2023年5月19日

Multimodal Prompting with Missing Modalities for Visual Recognition

Arxiv

11+阅读 · 2023年3月6日

A Survey on Masked Autoencoder for Self-supervised Learning in Vision and Beyond

Arxiv

10+阅读 · 2022年7月30日

Multimodal Learning with Transformers: A Survey

Arxiv

69+阅读 · 2022年6月13日

Hybrid Curriculum Learning for Emotion Recognition in Conversation

Arxiv

14+阅读 · 2021年12月22日

Graph Self-Supervised Learning: A Survey

Arxiv

15+阅读 · 2021年8月5日

SiT: Self-supervised vIsion Transformer

Arxiv

19+阅读 · 2021年4月8日

Self-correcting Q-Learning

Arxiv

11+阅读 · 2020年12月2日

Meta-Transfer Learning for Zero-Shot Super-Resolution

Meta-Transfer Learning for Zero-Shot Super-Resolution

Arxiv

43+阅读 · 2020年2月27日

相关基金

基于健康数据分析的半监督在线学习血糖预报建模算法研究

国家自然科学基金

2+阅读 · 2015年12月31日

结合图像块联合聚类加权和混合分类器的非对齐稀疏表示识别方法

国家自然科学基金

1+阅读 · 2015年12月31日

基于眼动追踪的个性化图像推荐研究

国家自然科学基金

5+阅读 · 2014年12月31日

Resveratrol联合MSCs移植对阿尔茨海默鼠的干预效果及Sirt1分子信号的介导作用

国家自然科学基金

0+阅读 · 2014年12月31日

基于多模态情感识别的人机交流氛围场建模方法

国家自然科学基金

5+阅读 · 2013年12月31日

面向大数据的机器学习理论与方法

国家自然科学基金

4+阅读 · 2013年12月31日

半监督半配对高维多表示数据的降维及拓展研究

国家自然科学基金

0+阅读 · 2011年12月31日

基于注意力的情感脑机接口研究与示范应用

国家自然科学基金

4+阅读 · 2010年12月31日

融合噪声处理及不完备模态测试的海洋平台结构损伤检测

国家自然科学基金

0+阅读 · 2009年12月31日

音视融合的韵律模式的个性化研究

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员