This study applies Large Language Models (LLMs) to multi-intent spoken language understanding (SLU), proposing a methodology that exploits the generative capacity of LLMs in an SLU setting. Our technique reconfigures entity slots for LLM use in multi-intent SLU and introduces Sub-Intent Instruction (SII), which helps decompose and interpret complex multi-intent utterances across varied domains. The resulting datasets, LM-MixATIS and LM-MixSNIPS, are built from existing benchmarks. Our results show that LLMs can match, and potentially surpass, current state-of-the-art multi-intent SLU models. We further examine LLM performance across different intent configurations and dataset proportions. Finally, we introduce two new metrics, Entity Slot Accuracy (ESA) and Combined Semantic Accuracy (CSA), to support a more detailed analysis of LLM performance on this task.