The capabilities and use cases of automatic natural language processing (NLP) have grown significantly over the last few years. While much work has been devoted to understanding how humans deal with discourse connectives, their processing remains understudied in computational systems. It is therefore important to put NLP models under the microscope and examine whether they can adequately comprehend, process, and reason within the complexity of natural language. In this chapter, we introduce the main mechanisms behind automatic sentence processing systems step by step and then focus on evaluating discourse connective processing. We assess nine popular systems on their ability to understand English discourse connectives and analyze how context and language understanding tasks affect their connective comprehension. The results show that NLP systems do not process all discourse connectives equally well and that the computational processing complexity of different kinds of connectives is not always in line with the presumed complexity order found in human processing. In addition, whereas connectives tend to influence humans during the reading process but not necessarily their final comprehension performance, they have a significant impact on the final accuracy of NLP systems. The richer the knowledge of connectives a system learns, the stronger the negative effect inappropriate connectives have on it. This suggests that the correct explicitation of discourse connectives is important for computational natural language processing.