Dialogue State Tracking (DST) models often rely on intricate neural network architectures that require substantial training data, and their inference processes lack transparency. This paper proposes a method that extracts linguistic knowledge through an unsupervised framework and then uses this knowledge to improve BERT's performance and interpretability on DST tasks. The knowledge extraction procedure is computationally inexpensive and requires neither annotations nor additional training data. Injecting the extracted knowledge requires adding only simple neural modules. We employ the Convex Polytopic Model (CPM) as a feature extraction tool for DST tasks and show that the acquired features correlate with syntactic and semantic patterns in the dialogues. This correlation supports a comprehensive understanding of the linguistic features that influence the DST model's decision-making process. We benchmark this framework on various DST tasks and observe a notable improvement in accuracy.