In this paper, we propose a solution for the semi-supervised learning track (MER-SEMI) of MER2024. First, to enhance the performance of the feature extractors on emotion classification tasks, we fine-tuned the video and text feature extractors, specifically CLIP-vit-large and Baichuan-13B, using labeled data. This approach effectively preserves the original emotional information conveyed in the videos. Second, we propose an Audio-Guided Transformer (AGT) fusion mechanism that leverages the robustness of Hubert-large and is highly effective at fusing both inter-channel and intra-channel information. Third, to improve model accuracy, we iteratively apply self-supervised learning, assigning pseudo-labels to high-confidence unlabeled samples and adding them to the training set. Finally, through black-box probing, we discovered an imbalance in the data distribution between the training and test sets, so we adopt a prior-knowledge-based voting mechanism. The results demonstrate the effectiveness of our strategy, ultimately earning us third place in the MER-SEMI track.
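The iterative pseudo-labeling step can be sketched as a standard self-training loop: train on the labeled set, predict on the unlabeled pool, promote predictions above a confidence threshold to pseudo-labels, and retrain until no confident samples remain. The toy nearest-centroid classifier, the 0.9 threshold, and the 1-D features below are illustrative assumptions, not the authors' actual models or hyperparameters.

```python
import math

def train_centroids(X, y):
    """Toy stand-in for model training: per-class mean of 1-D features."""
    groups = {}
    for xi, yi in zip(X, y):
        groups.setdefault(yi, []).append(xi)
    return {c: sum(v) / len(v) for c, v in groups.items()}

def predict_with_confidence(centroids, x):
    """Return (label, confidence); confidence via softmax over negative distances."""
    scores = {c: -abs(x - m) for c, m in centroids.items()}
    z = sum(math.exp(s) for s in scores.values())
    label = max(scores, key=scores.get)
    return label, math.exp(scores[label]) / z

def self_training(X_lab, y_lab, X_unlab, threshold=0.9, max_rounds=5):
    """Iteratively promote high-confidence unlabeled samples to pseudo-labels."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    for _ in range(max_rounds):
        model = train_centroids(X_lab, y_lab)
        preds = [(x, *predict_with_confidence(model, x)) for x in pool]
        keep = [(x, lab) for x, lab, conf in preds if conf >= threshold]
        if not keep:  # no sample clears the confidence bar -> stop
            break
        for x, lab in keep:
            X_lab.append(x)
            y_lab.append(lab)
        kept_xs = {x for x, _ in keep}
        pool = [x for x in pool if x not in kept_xs]
    return train_centroids(X_lab, y_lab), len(pool)
```

Samples far from any class centroid (low confidence) stay in the pool, which mirrors the rationale in the abstract: only high-confidence predictions are trusted as pseudo-labels, limiting error accumulation across rounds.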