Task-oriented dialogue (TOD) systems are mainly based on the slot-filling-based TOD (SF-TOD) framework, in which dialogues are broken down into smaller, controllable units (i.e., slots) to fulfill a specific task. A series of approaches based on this framework have achieved remarkable success on various TOD benchmarks. However, we argue that current TOD benchmarks are only limited surrogates for real-world scenarios and that current TOD models are still a long way from covering those scenarios. In this position paper, we first identify the current status and limitations of SF-TOD systems. We then explore the WebTOD framework, an alternative direction for building a scalable TOD system when a web/mobile interface is available. In WebTOD, the dialogue system, powered by a large-scale language model, learns to understand the web/mobile interface that the human agent interacts with.