In this work, we aim to establish a Bayesian adaptive learning framework that focuses on estimating latent variables in deep neural network (DNN) models. Latent variables encode both transferable distributional information and structural relationships. The distributions of the source latent variables (the prior) can therefore be combined with the knowledge learned from the target data (the likelihood) to yield the distributions of the target latent variables (the posterior), with the goal of addressing acoustic mismatches between training and testing conditions. The prior knowledge transfer is accomplished through Variational Bayes (VB). In addition, we investigate Maximum a Posteriori (MAP) based Bayesian adaptation. Experimental results on device adaptation in acoustic scene classification show that the proposed approaches improve performance on target devices and consistently outperform other cutting-edge algorithms.
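To make the prior/likelihood/posterior combination concrete, the following is a minimal sketch on a toy problem, not the DNN latent-variable method described above. It assumes a single scalar latent mean with a Gaussian prior (standing in for the source distribution) and Gaussian observations with known noise variance (standing in for target data). The function name `gaussian_posterior` and all numbers are hypothetical and chosen only for illustration. In this conjugate case the posterior is available in closed form, and the MAP estimate coincides with the posterior mean.

```python
from statistics import fmean


def gaussian_posterior(prior_mean, prior_var, target_obs, noise_var):
    """Closed-form posterior over a scalar latent mean.

    Assumes a Gaussian prior N(prior_mean, prior_var) and i.i.d. Gaussian
    observations with known variance noise_var (a toy stand-in for the
    source prior and target likelihood; not the paper's DNN model).
    Returns (posterior_mean, posterior_variance).
    """
    n = len(target_obs)
    if n == 0:
        # No target data: the posterior is just the prior.
        return prior_mean, prior_var
    prior_precision = 1.0 / prior_var
    data_precision = n / noise_var
    post_var = 1.0 / (prior_precision + data_precision)
    # Precision-weighted average of the prior mean and the target sample mean.
    post_mean = post_var * (
        prior_precision * prior_mean + data_precision * fmean(target_obs)
    )
    return post_mean, post_var


# Hypothetical values: the source prior is centred at 0, the target data at 2.
post_mean, post_var = gaussian_posterior(
    prior_mean=0.0, prior_var=1.0, target_obs=[2.0, 2.0, 2.0, 2.0], noise_var=1.0
)
map_estimate = post_mean  # For a Gaussian posterior, the mode equals the mean.
print(map_estimate, post_var)
```

With few target observations the estimate stays close to the source prior, and as more target data arrive it moves toward the target sample mean, which is the qualitative behaviour that Bayesian adaptation relies on.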