A sheaf is a mathematical object consisting of a base space, which is a topological space, together with data associated with each of its open sets, e.g. the continuous functions defined on those open sets. Sheaves were originally used in algebraic topology and logic. More recently, they have also been used to model events such as physical experiments and natural language disambiguation processes. We extend the latter models from lexical ambiguities to discourse ambiguities arising from anaphora. First, we calculate a new measure of contextuality for a dataset of basic anaphoric discourses, which yields a much higher proportion of contextual models (82.9%) than previous work, which yielded only 3.17%. Then, we show how an extension of the Winograd Schema, a natural language processing challenge that involves anaphoric ambiguities, can be modelled on the Bell-CHSH scenario with a contextual fraction of 0.096.
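To make the notion of a contextual fraction on the Bell-CHSH scenario more concrete, the following is a minimal sketch and not the authors' method. It assumes a no-signalling empirical model with two parties, two measurements each, and binary outcomes; in that restricted setting the contextual fraction is commonly taken to coincide with the normalised CHSH violation, max(0, (S - 2) / 2), where S is the largest absolute value among the eight CHSH expressions. In general the contextual fraction is defined through a linear program, which this shortcut does not implement. The names `correlator`, `chsh_contextual_fraction`, and `bell_model` are hypothetical, and the example table is the standard textbook Bell model, not data from this work.

```python
from itertools import product


def correlator(row):
    """Return P(outcomes agree) - P(outcomes differ) for one context.

    `row` maps outcome pairs (a, b), with a, b in {0, 1}, to probabilities.
    """
    return sum(p if a == b else -p for (a, b), p in row.items())


def chsh_contextual_fraction(model):
    """Normalised CHSH violation of a (2, 2, 2) empirical model.

    `model` maps contexts (x, y), with x, y in {0, 1}, to rows of
    outcome probabilities. ASSUMPTION: the model is no-signalling, so
    that this quantity can be read as the contextual fraction.
    """
    E = {ctx: correlator(row) for ctx, row in model.items()}
    best = 0.0
    # Each CHSH expression puts a minus sign on exactly one context;
    # taking the absolute value covers the overall sign flip as well.
    for flipped in product((0, 1), repeat=2):
        s = sum((-1 if ctx == flipped else 1) * E[ctx] for ctx in E)
        best = max(best, abs(s))
    return max(0.0, (best - 2.0) / 2.0)


# Illustrative textbook Bell model (not data from the paper).
bell_model = {
    (0, 0): {(0, 0): 1 / 2, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 1 / 2},
    (0, 1): {(0, 0): 3 / 8, (0, 1): 1 / 8, (1, 0): 1 / 8, (1, 1): 3 / 8},
    (1, 0): {(0, 0): 3 / 8, (0, 1): 1 / 8, (1, 0): 1 / 8, (1, 1): 3 / 8},
    (1, 1): {(0, 0): 1 / 8, (0, 1): 3 / 8, (1, 0): 3 / 8, (1, 1): 1 / 8},
}

if __name__ == "__main__":
    print(chsh_contextual_fraction(bell_model))
```

Under the same no-signalling assumption, a contextual fraction of 0.096 would correspond to a CHSH value of roughly 2 + 2 × 0.096 = 2.192, a modest violation of the classical bound of 2.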