In this work, we introduce several schemes to leverage description-augmented embedding similarity for dataless intent classification using current state-of-the-art (SOTA) text embedding models. We report results of our methods on four commonly used intent classification datasets and compare against previous works of a similar nature. Our work shows promising results for dataless classification scaling to a large number of unseen intents. We show competitive results and significant improvements (+6.12\% Avg.) over strong zero-shot baselines, all without training on labelled or task-specific data. Furthermore, we provide qualitative error analysis of the shortfalls of this methodology to help guide future research in this area.
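The core mechanism described above — assigning an utterance to the intent whose textual description is nearest in embedding space — can be sketched as follows. This is a minimal illustration with toy vectors, not the paper's implementation; the function name and the 4-dimensional embeddings are assumptions standing in for outputs of a real SOTA text embedding model.

```python
import numpy as np

def dataless_intent_classify(utterance_emb: np.ndarray,
                             description_embs: np.ndarray) -> int:
    """Return the index of the intent whose description embedding has the
    highest cosine similarity to the utterance embedding. No labelled
    training data is involved: intents are represented only by the
    embeddings of their natural-language descriptions."""
    u = utterance_emb / np.linalg.norm(utterance_emb)
    d = description_embs / np.linalg.norm(description_embs, axis=1, keepdims=True)
    return int(np.argmax(d @ u))

# Toy 4-dimensional embeddings standing in for real model outputs.
utterance = np.array([0.9, 0.1, 0.0, 0.1])
descriptions = np.array([
    [0.1, 0.9, 0.1, 0.0],  # hypothetical intent 0
    [0.8, 0.2, 0.1, 0.1],  # hypothetical intent 1 (closest to the utterance)
    [0.0, 0.1, 0.9, 0.2],  # hypothetical intent 2
])
print(dataless_intent_classify(utterance, descriptions))  # → 1
```

Because only description embeddings are needed per class, this scheme extends to a large set of unseen intents by simply appending rows to `description_embs`.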