This paper presents a Soft Labeling and Noisy Mixup-based open intent classification model (SNOiC). Most previous works rely on threshold-based methods to identify open intents, which are prone to overfitting and may produce biased predictions. The scarcity of available data for the open intent class is a further limitation of these existing models. SNOiC combines Soft Labeling and Noisy Mixup strategies to reduce bias and to generate pseudo-data for the open intent class. Experimental results on four benchmark datasets show that the SNOiC model achieves a minimum and maximum performance of 68.72\% and 94.71\%, respectively, in identifying open intents. Moreover, compared to state-of-the-art models, the SNOiC model improves the performance of identifying open intents by between 0.93\% (minimum) and 12.76\% (maximum). The model's efficacy is further established by analyzing the parameters used in the proposed model. An ablation study with three model variants is also conducted to validate the effectiveness of the SNOiC model.
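To make the idea of generating pseudo-data for the open intent class more concrete, the following is a minimal illustrative sketch and not the SNOiC implementation. It assumes that a pseudo open-intent sample is formed by convexly mixing the feature vectors of two known-intent samples and adding Gaussian noise, and that its soft label places most of the probability mass on an extra open class. The function names `noisy_mixup` and `soft_label`, and the parameters `lam`, `noise_std`, and `open_mass`, are hypothetical and chosen only for illustration.

```python
import random


def noisy_mixup(x_a, x_b, lam, noise_std, rng):
    """Convexly mix two feature vectors, then add Gaussian noise per dimension.

    Assumption: this is one plausible reading of "Noisy Mixup", not the
    formulation used in the paper.
    """
    return [lam * a + (1.0 - lam) * b + rng.gauss(0.0, noise_std)
            for a, b in zip(x_a, x_b)]


def soft_label(num_known, open_mass, class_a, class_b, lam):
    """Build a soft target over num_known known classes plus one open class.

    The open class is stored at the last index. The mass not assigned to the
    open class is split between the two source classes according to lam.
    Assumption: the split rule is illustrative only.
    """
    label = [0.0] * (num_known + 1)
    remaining = 1.0 - open_mass
    label[class_a] += remaining * lam
    label[class_b] += remaining * (1.0 - lam)
    label[num_known] = open_mass
    return label


# Toy example: two 3-dimensional features from known classes 0 and 1.
rng = random.Random(0)
x_a = [1.0, 0.0, 0.0]
x_b = [0.0, 1.0, 0.0]

pseudo_x = noisy_mixup(x_a, x_b, lam=0.5, noise_std=0.1, rng=rng)
pseudo_y = soft_label(num_known=3, open_mass=0.8, class_a=0, class_b=1, lam=0.5)

print(pseudo_x)
print(pseudo_y)
```

In this sketch the soft label keeps a small share of probability on the two source classes rather than assigning a hard open label, which is one way a soft target could soften the decision boundary between known and open intents.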