With machine learning models increasingly deployed on edge and Internet-of-Things (IoT) devices, running advanced models on such resource-constrained hardware remains challenging. Transformer models, a currently dominant neural architecture, have achieved great success across broad domains, but their complexity hinders their deployment on IoT devices with limited computation capability and storage. Although many model compression approaches have been explored, they often suffer from severe performance degradation. To address this issue, we introduce a new method, namely Transformer Re-parameterization, to boost the performance of lightweight Transformer models. It consists of two processes: a High-Rank Factorization (HRF) process in the training stage and a deHigh-Rank Factorization (deHRF) process in the inference stage. In the former, we insert an additional linear layer before the Feed-Forward Network (FFN) of the lightweight Transformer; the inserted HRF layer is expected to enhance the model's learning capability. In the latter, the auxiliary HRF layer is merged with the following FFN layer into a single linear layer, thereby recovering the original structure of the lightweight model. To examine the effectiveness of the proposed method, we evaluate it on three widely used Transformer variants, i.e., ConvTransformer, Conformer, and SpeechFormer, for speech emotion recognition on the IEMOCAP, M3ED, and DAIC-WOZ datasets. Experimental results show that the proposed method consistently improves the performance of lightweight Transformers, even making them comparable to large models. The proposed re-parameterization approach enables advanced Transformer models to be deployed on resource-constrained IoT devices.
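The deHRF merge described above rests on a standard identity: two consecutive linear layers with no nonlinearity between them are equivalent to a single linear layer. A minimal NumPy sketch of this merge follows; it is an illustration under assumed (hypothetical) layer dimensions, not the authors' implementation:

```python
import numpy as np

# Sketch of the deHRF merge: the inserted HRF linear layer is folded into
# the following FFN linear layer, since
#   W2 @ (W1 @ x + b1) + b2 == (W2 @ W1) @ x + (W2 @ b1 + b2).
# All dimensions below are hypothetical, chosen only for illustration.
rng = np.random.default_rng(0)
d_model, d_hrf, d_ffn = 8, 32, 16

W1 = rng.standard_normal((d_hrf, d_model))  # inserted HRF layer (training only)
b1 = rng.standard_normal(d_hrf)
W2 = rng.standard_normal((d_ffn, d_hrf))    # first linear layer of the FFN
b2 = rng.standard_normal(d_ffn)

# Training-time forward pass: two consecutive linear layers.
x = rng.standard_normal(d_model)
y_train = W2 @ (W1 @ x + b1) + b2

# Inference-time merge (deHRF): collapse into one linear layer,
# recovering the lightweight model's original structure.
W_merged = W2 @ W1                # shape (d_ffn, d_model)
b_merged = W2 @ b1 + b2
y_infer = W_merged @ x + b_merged

assert np.allclose(y_train, y_infer)  # merged layer is exactly equivalent
```

Note that this equivalence holds only because no activation function sits between the HRF layer and the FFN layer it merges into; any nonlinearity in between would break the identity.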