Parameter-Efficient Fine-Tuning (PEFT) is increasingly recognized as an effective method in speech processing. However, which PEFT method works best, and in which layers it should be placed, remains an open question. We conduct extensive experiments to compare different PEFT methods and their layer-wise placement by adapting Differentiable Architecture Search (DARTS). We also explore ensemble learning as a way to leverage diverse PEFT strategies. The results show that DARTS does not outperform the baseline approach of inserting the same PEFT method into every layer of a Self-Supervised Learning (SSL) model. In contrast, ensemble learning, particularly with majority voting, achieves superior performance. Our statistical evidence indicates that different PEFT methods learn in different ways. This variation might explain why combining PEFT methods through ensemble learning exploits their distinct learning capabilities more effectively than optimizing the choice of method layer by layer.
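To make the majority-voting ensemble concrete, the following is a minimal sketch rather than the implementation used in the study. It assumes each PEFT-tuned model has already produced one discrete label per test example. The function name `majority_vote`, the variable names, the example labels, and the tie-breaking rule (the earliest model in the list wins) are all illustrative assumptions.

```python
from collections import Counter
from typing import Hashable, List, Sequence


def majority_vote(predictions: Sequence[Sequence[Hashable]]) -> List[Hashable]:
    """Combine per-model label predictions by majority vote.

    predictions[m][i] is the label that model m assigns to example i.
    Assumption: ties go to the label predicted by the earliest model.
    """
    if not predictions:
        raise ValueError("need predictions from at least one model")
    n_examples = len(predictions[0])
    if any(len(p) != n_examples for p in predictions):
        raise ValueError("all models must predict the same number of examples")

    voted = []
    for votes in zip(*predictions):
        # Counter keeps first-seen order, so most_common(1) resolves ties
        # in favour of the earliest model in the list.
        voted.append(Counter(votes).most_common(1)[0][0])
    return voted


# Hypothetical outputs of three models, each fine-tuned with a different
# PEFT method (labels are made up for illustration).
method_a_preds = ["yes", "no", "up", "down"]
method_b_preds = ["yes", "no", "down", "down"]
method_c_preds = ["no", "no", "up", "left"]

ensemble_preds = majority_vote([method_a_preds, method_b_preds, method_c_preds])
print(ensemble_preds)  # ['yes', 'no', 'up', 'down']
```

Voting operates only on the final predictions, so the individual models can use entirely different PEFT methods without sharing any parameters or architecture.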