This study advances multi-intent spoken language understanding (SLU) by harnessing Large Language Models (LLMs), proposing a methodology that exploits their generative power in an SLU context. Our technique reconfigures entity slots specifically for LLM use in multi-intent SLU and introduces the concept of Sub-Intent Instruction (SII), improving the decomposition and interpretation of complex, multi-intent utterances across varied domains. The resulting datasets, LM-MixATIS and LM-MixSNIPS, are constructed from existing benchmarks. Our experiments show that LLMs can match, and potentially surpass, current state-of-the-art multi-intent SLU models, and we further analyze LLM performance across different intent configurations and dataset proportions. Moreover, we introduce two new metrics, Entity Slot Accuracy (ESA) and Combined Semantic Accuracy (CSA), to enable an in-depth analysis of LLM proficiency in this complex task.