With increasing demand for and adoption of virtual assistants, recent work has investigated ways to accelerate bot schema design through the automatic induction of intents or the induction of slots and dialogue states. However, a lack of dedicated benchmarks and standardized evaluation has made progress difficult to track and systems difficult to compare. This challenge track, held as part of the Eleventh Dialog Systems Technology Challenge, introduces a benchmark for evaluating methods that automatically induce customer intents in a realistic setting: customer service interactions between human agents and customers. We propose two subtasks that progressively tackle the automatic induction of intents, along with corresponding evaluation methodologies. We then present three datasets suitable for evaluating the tasks and propose simple baselines. Finally, we summarize the submissions and results of the challenge track, which received submissions from 34 teams.