Adapting large pre-trained models to specific tasks has yielded impressive results. However, fully fine-tuning these increasingly large models is becoming prohibitively resource-intensive, which has shifted attention toward parameter-efficient transfer learning, so far primarily within a single modality. This same-modality focus is limiting, particularly in video understanding, where suitable pre-trained models are less common. To address this, our study introduces a novel cross-modality approach: parameter-efficient image-to-video transfer learning. We present the Facial-Emotion Adapter (FE-Adapter), designed for efficient fine-tuning on video tasks. The adapter enables pre-trained image models, which lack temporal processing capabilities, to analyze dynamic video content efficiently, using about 15 times fewer parameters than previous methods while improving accuracy. Our experiments on video emotion recognition demonstrate that the FE-Adapter can match or even surpass existing fine-tuning methods and video emotion models in both performance and efficiency. These results highlight the potential of cross-modality approaches for enhancing AI models, particularly in fields such as video emotion analysis, where demands on efficiency and accuracy are constantly rising.
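The abstract does not include implementation details, but the general idea of a bottleneck adapter that adds temporal processing to a frozen image backbone can be illustrated. The following is a minimal NumPy sketch, not the authors' actual FE-Adapter: all names, shapes, and the choice of a depthwise temporal convolution inside the bottleneck are assumptions for illustration only.

```python
import numpy as np

def fe_adapter_sketch(x, W_down, W_up, temporal_kernel):
    """Hypothetical bottleneck adapter with a temporal component.

    x               : (T, D) per-frame features from a frozen image backbone
    W_down          : (D, d) down-projection to a small bottleneck, d << D
    W_up            : (d, D) up-projection back to the backbone width
    temporal_kernel : (k,) 1-D kernel mixing features across frames (k odd)
    """
    h = x @ W_down                          # down-project: (T, d)
    T, d = h.shape
    k = temporal_kernel.shape[0]
    pad = k // 2
    # "same"-padded temporal convolution, shared across bottleneck channels
    h_pad = np.pad(h, ((pad, pad), (0, 0)))
    h_t = np.zeros_like(h)
    for t in range(T):
        h_t[t] = np.sum(h_pad[t:t + k] * temporal_kernel[:, None], axis=0)
    h_t = np.maximum(h_t, 0.0)              # nonlinearity (ReLU)
    return x + h_t @ W_up                   # up-project and add residual

# Example: 8 frames of 16-dim features, bottleneck width 4, kernel size 3
rng = np.random.default_rng(0)
T, D, d, k = 8, 16, 4, 3
x = rng.standard_normal((T, D))
out = fe_adapter_sketch(
    x,
    rng.standard_normal((D, d)) * 0.1,
    rng.standard_normal((d, D)) * 0.1,
    rng.standard_normal(k),
)
```

Because only the adapter weights (roughly 2·D·d + k values per insertion point) would be trained while the backbone stays frozen, the trainable parameter count scales with the small bottleneck width d rather than with the backbone size, which is the general mechanism behind the parameter savings described above.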