In this paper, we describe the development of symbolic representations annotated on human-robot dialogue data to make dimensions of meaning accessible to autonomous systems participating in collaborative, natural language dialogue, and to enable common ground with human partners. A particular challenge for establishing common ground arises in remote dialogue (such as in disaster relief or search-and-rescue tasks), where a human and robot are engaged in a joint navigation and exploration task in an unfamiliar environment, but where the robot cannot immediately share high-quality visual information due to communication constraints. Engaging in dialogue provides an effective way to communicate, supplemented by on-demand or lower-quality visual information to establish common ground. Within this paradigm, we capture the propositional semantics and illocutionary force of a single utterance within the dialogue through our Dialogue-AMR annotation, an augmentation of Abstract Meaning Representation. We then capture patterns in how different utterances within and across speaker floors relate to one another through our multi-floor Dialogue Structure annotation schema. Finally, we begin to annotate and analyze the ways in which the visual modalities provide contextual information to the dialogue for overcoming disparities in the collaborators' understanding of the environment. We conclude by discussing the use cases, architectures, and systems we have implemented from our annotations that enable physical robots to autonomously engage with humans in bi-directional dialogue and navigation.