Dialect classification is used in a variety of applications, such as machine translation and speech recognition, to improve overall system performance. In a real-world scenario, a deployed dialect classification model can encounter anomalous inputs that differ from the training data distribution, known as out-of-distribution (OOD) samples. These OOD samples can lead to unexpected outputs, because their dialects were not seen during model training. Out-of-distribution detection is a new research area that has received little attention in the context of dialect classification. To address this, we propose a simple yet effective unsupervised method for detecting OOD samples, based on Mahalanobis distance features. We utilize the latent embeddings from all intermediate layers of a wav2vec 2.0 transformer-based dialect classifier model for multi-task learning. The proposed approach significantly outperforms other state-of-the-art OOD detection methods.
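As an illustration of the general idea, the following minimal sketch scores samples by their Mahalanobis distance to the training distribution, layer by layer. It is not the authors' implementation: the abstract does not give the details, so the sketch assumes one Gaussian fitted per layer without class labels, a sum of squared distances across layers as the final score, and synthetic random vectors standing in for wav2vec 2.0 layer embeddings. The function names `fit_layer_gaussians` and `mahalanobis_score` are hypothetical.

```python
import numpy as np


def fit_layer_gaussians(layer_embeddings):
    """Fit one Gaussian (mean, inverse covariance) per layer.

    layer_embeddings: list of arrays, each of shape (n_samples, dim),
    one array per intermediate layer. No class labels are used.
    """
    stats = []
    for emb in layer_embeddings:
        mean = emb.mean(axis=0)
        cov = np.cov(emb, rowvar=False)
        # Small ridge term (an assumption) keeps the covariance invertible.
        cov += 1e-6 * np.eye(cov.shape[0])
        stats.append((mean, np.linalg.inv(cov)))
    return stats


def mahalanobis_score(layer_embeddings, stats):
    """Per-sample sum of squared Mahalanobis distances across layers."""
    total = np.zeros(layer_embeddings[0].shape[0])
    for emb, (mean, cov_inv) in zip(layer_embeddings, stats):
        diff = emb - mean
        # diff_i^T @ cov_inv @ diff_i for every sample i.
        total += np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return total


# Synthetic stand-ins for layer embeddings (assumption: 3 layers, 8 dims).
rng = np.random.default_rng(0)
n_layers, dim = 3, 8
train = [rng.normal(0.0, 1.0, size=(500, dim)) for _ in range(n_layers)]
in_dist = [rng.normal(0.0, 1.0, size=(100, dim)) for _ in range(n_layers)]
ood = [rng.normal(4.0, 1.0, size=(100, dim)) for _ in range(n_layers)]

stats = fit_layer_gaussians(train)
in_scores = mahalanobis_score(in_dist, stats)
ood_scores = mahalanobis_score(ood, stats)
```

In this toy setup a higher score indicates a sample farther from the training distribution, so a threshold on the score (for example, one chosen on held-out in-distribution data) could flag OOD inputs. In practice the embeddings would come from the classifier's intermediate layers rather than from a random generator.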