Ambiguity is a natural language phenomenon that occurs at the levels of syntax, semantics, and pragmatics. It is widely studied; in psycholinguistics, for instance, there is a variety of competing studies of human disambiguation processes. These studies are empirical and based on eye-tracking measurements. Here we take first steps towards formalizing these processes for semantic ambiguities, in which we identified two features: (1) joint plausibility degrees of the different possible interpretations, and (2) causal structures according to which certain words play a more substantial role in the processes. The novel sheaf-theoretic model of definite causality developed by Gogioso and Pinzani in QPL 2021 offers tools to model and reason about these features. We applied this theory to a dataset of ambiguous phrases extracted from the psycholinguistics literature, together with human plausibility judgements that we collected using Amazon Mechanical Turk. We measured the causal fractions of different disambiguation orders within the phrases and discovered two prominent orders: from subject to verb in subject-verb phrases, and from object to verb in verb-object phrases. We also found evidence for a delay in the disambiguation of polysemous versus homonymous verbs, again compatible with psycholinguistic findings.