Intelligent systems that aim to master language as humans do must deal with semantic underspecification, namely, the possibility for a linguistic signal to convey only part of the information needed for communication to succeed. Consider the pronoun "they", which can leave the gender and number of its referent(s) underspecified. Semantic underspecification is not a bug but a crucial feature of language that boosts its storage and processing efficiency. Indeed, human speakers can quickly and effortlessly integrate semantically underspecified linguistic signals with a wide range of non-linguistic information, e.g., the multimodal context, social or cultural conventions, and shared knowledge. Standard NLP models have, in principle, no or limited access to such extra information, while multimodal systems that ground language in other modalities, such as vision, are naturally equipped to account for this phenomenon. However, we show that these multimodal systems struggle with it, which could negatively affect their performance and lead to harmful consequences when they are used in applications. In this position paper, we argue that our community should be aware of semantic underspecification if it aims to develop language technology that can successfully interact with human users. We discuss some applications where mastering it is crucial and outline a few directions toward achieving this goal.