Summarizing conversation, that is, producing discourse about discourse, exposes pragmatics as a pervasive limitation of both summarization and other applications of contemporary conversational AI. Beyond syntax and semantics, where progress has been impressive, pragmatics concerns meaning in practical use. In this paper, we discuss several challenges in the summarization of conversations and in other conversational AI applications, drawing on relevant theoretical work. We illustrate the importance of pragmatics with so-called star sentences: syntactically acceptable propositions that are pragmatically inappropriate in a conversation or its summary. Because the baseline for the quality of AI is indistinguishability from human behavior, we draw heavily on the psycholinguistics literature and label our complaints "Turing Test Triggers" (TTTs). We discuss implications for the design and evaluation of conversation summarization methods and of conversational AI applications such as voice assistants and chatbots.