Sheaves are mathematical objects consisting of a base, which is a topological space, together with data associated with each of its open sets, e.g. continuous functions defined on those open sets. Sheaves were originally used in algebraic topology and logic. More recently, they have also been used to model events such as physical experiments and natural language disambiguation processes. We extend the latter models from lexical ambiguities to discourse ambiguities arising from anaphora. First, we calculate a new measure of contextuality for a dataset of basic anaphoric discourses, which yields a much higher proportion of contextual models (82.9%) than previous work, where only 3.17% of models were contextual. Then, we show how an extension of the Winograd Schema, a natural language processing challenge involving anaphoric ambiguities, can be modelled on the Bell-CHSH scenario with a contextual fraction of 0.096.
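To make the notion of a contextual fraction on the Bell-CHSH scenario concrete, the following is a minimal sketch and not the procedure used in this work. It assumes a no-signalling empirical model with two parties, two measurements each, and binary outcomes; under that assumption the contextual fraction is commonly stated in closed form as max(0, (S - 2) / 2), where S is the largest absolute value among the CHSH expressions. The function names and the dictionary layout of the model are hypothetical, chosen only for illustration; general scenarios require a linear program instead.

```python
from itertools import product


def correlator(dist):
    """Expected product of two +/-1-valued outcomes.

    `dist` maps a joint outcome (x, y), with x, y in {0, 1}, to its probability.
    Equal outcomes contribute +1, unequal outcomes contribute -1.
    """
    return sum(p * (1 if x == y else -1) for (x, y), p in dist.items())


def chsh_contextual_fraction(model):
    """Contextual fraction of a no-signalling model on the CHSH scenario.

    `model` maps each context (i, j), with i, j in {0, 1}, to a distribution
    over joint outcomes. Assumption: the model is no-signalling, so the
    closed form max(0, (S - 2) / 2) applies.
    """
    E = {ctx: correlator(dist) for ctx, dist in model.items()}
    best = 0.0
    # The CHSH expressions differ in which correlator carries the minus sign;
    # taking the absolute value covers the overall sign flip.
    for minus in product((0, 1), repeat=2):
        s = sum((-1 if ctx == minus else 1) * E[ctx] for ctx in E)
        best = max(best, abs(s))
    return max(0.0, (best - 2) / 2)


# Illustrative models (not data from this work).
correlated = {(0, 0): 0.5, (1, 1): 0.5}
anticorrelated = {(0, 1): 0.5, (1, 0): 0.5}

# Popescu-Rohrlich box: maximally contextual.
pr_box = {(0, 0): correlated, (0, 1): correlated,
          (1, 0): correlated, (1, 1): anticorrelated}

# Perfectly correlated classical model: non-contextual.
classical = {ctx: correlated for ctx in product((0, 1), repeat=2)}

print(chsh_contextual_fraction(pr_box))
print(chsh_contextual_fraction(classical))
```

A contextual fraction of 0 means the model is fully explained by a global (non-contextual) assignment, while 1 means no part of it can be; values in between measure how much of the model resists such an explanation.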