In this paper, we compare few-shot learning, specifically meta-learning-based few-shot relation networks, with supervised deep learning and conventional machine learning approaches on the problem of Sound Source Distance Estimation (SSDE). In previous research on deep supervised SSDE, low accuracy has often resulted from the mismatch between the training data (from known environments) and the test data (from unknown environments). Through comparative experiments on a sufficient amount of data, we show that the few-shot relation network outperforms competing methods, including eXtreme Gradient Boosting (XGBoost), Support Vector Machine (SVM), Convolutional Neural Network (CNN), and MultiLayer Perceptron (MLP). Hence, a microphone-equipped system can be calibrated with a few labeled audio samples recorded in a particular unknown environment, adapting the classifier to the inputs it is likely to encounter there and yielding higher accuracy.
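The few-shot calibration idea in the abstract can be illustrated with a minimal sketch of relation-network-style scoring. This is not the authors' implementation. It assumes that SSDE is posed as classification over a small set of distance classes, that each clip is already a fixed-length feature vector, and that support embeddings are averaged per class. The weights are random stand-ins for trained parameters, and all names and sizes (`N_FEATURES`, `EMBED_DIM`, `relation_scores`, and so on) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the abstract does not specify features or architecture.
N_FEATURES = 40   # length of the per-clip audio feature vector (assumed)
EMBED_DIM = 16    # embedding size (assumed)
HIDDEN_DIM = 8    # hidden size of the relation module (assumed)

# Random weights stand in for parameters that meta-training would learn.
W_embed = rng.normal(scale=0.1, size=(N_FEATURES, EMBED_DIM))
W_rel1 = rng.normal(scale=0.1, size=(2 * EMBED_DIM, HIDDEN_DIM))
W_rel2 = rng.normal(scale=0.1, size=(HIDDEN_DIM, 1))


def embed(x):
    """Embedding module: one linear layer followed by ReLU."""
    return np.maximum(x @ W_embed, 0.0)


def relation_scores(support_x, support_y, query_x, n_classes):
    """Return an (n_query, n_classes) array of relation scores in (0, 1)."""
    support_emb = embed(support_x)
    query_emb = embed(query_x)

    # Pool the few labeled support samples of each class (here: mean).
    class_emb = np.stack(
        [support_emb[support_y == c].mean(axis=0) for c in range(n_classes)]
    )

    # Pair every query embedding with every class embedding.
    n_query = query_emb.shape[0]
    q = np.repeat(query_emb[:, None, :], n_classes, axis=1)
    c = np.repeat(class_emb[None, :, :], n_query, axis=0)
    pairs = np.concatenate([c, q], axis=-1)

    # Relation module: small MLP with a sigmoid output per pair.
    hidden = np.maximum(pairs @ W_rel1, 0.0)
    logits = (hidden @ W_rel2)[..., 0]
    return 1.0 / (1.0 + np.exp(-logits))


# Toy "calibration" episode: 3 distance classes, 5 labeled clips each,
# recorded in the new environment (synthetic data, for shape checking only).
n_classes, n_shots = 3, 5
support_y = np.repeat(np.arange(n_classes), n_shots)
support_x = rng.normal(size=(n_classes * n_shots, N_FEATURES))
query_x = rng.normal(size=(4, N_FEATURES))

scores = relation_scores(support_x, support_y, query_x, n_classes)
predicted = scores.argmax(axis=1)  # index of the best-matching distance class
```

With untrained weights the predictions carry no meaning. The sketch only shows how a handful of labeled clips from a new environment enter the decision at inference time, without retraining the model on that environment.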