Distributional semantics offers new ways to study the semantics of morphology. This study focuses on the semantics of English singular nouns and their inflected plural forms. Our goal is to compare two models of the conceptualization of plurality. One model (FRACSS) proposes that all singular-plural pairs should be taken into account when predicting plural semantics from singular semantics. The other model (CCA) holds that the conceptualization of plurality depends primarily on the semantic class of the base word. We evaluate the two models by how well the speech signal of plural tokens in a large corpus of spoken American English aligns with the semantic vectors that each model predicts. Two measures are employed: the performance of a form-to-meaning mapping, and the correlation between form distances and meaning distances. Both measures show a better alignment for CCA. These results suggest that usage-based approaches to pluralization, which give priority to a word's own semantic neighborhood, outperform theories that conceptualize pluralization as a process building on high-level abstraction. What has often been conceived of as a highly abstract concept, [+plural], is better captured by a family of mid-level partial generalizations.
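The contrast between the two models can be illustrated with a toy sketch. The abstract does not specify either implementation, so the code below rests on two assumptions: the all-pairs model is approximated by a single least-squares linear map fitted on every singular-plural pair, and the class-based model by an average plural-minus-singular shift vector per semantic class. All function names, class labels, and vectors are hypothetical and synthetic; nothing here reproduces the study's data or its speech-based evaluation.

```python
import numpy as np

def fit_global_map(S, P):
    """One matrix M for all pairs, so that S @ M approximates P (assumed all-pairs model)."""
    M, *_ = np.linalg.lstsq(S, P, rcond=None)
    return M

def fit_class_shifts(S, P, classes):
    """Average plural-minus-singular shift vector per class (assumed class-based model)."""
    labels = np.asarray(classes)
    return {c: (P[labels == c] - S[labels == c]).mean(axis=0)
            for c in sorted(set(classes))}

def predict_with_shifts(S, classes, shifts):
    """Add each word's class shift to its singular vector."""
    return np.stack([s + shifts[c] for s, c in zip(S, classes)])

def mean_cosine(A, B):
    """Mean row-wise cosine similarity between predicted and observed vectors."""
    num = (A * B).sum(axis=1)
    den = np.linalg.norm(A, axis=1) * np.linalg.norm(B, axis=1)
    return float((num / den).mean())

# Synthetic data: two hypothetical semantic classes, each with its own plural shift.
rng = np.random.default_rng(0)
n_per_class = 20
true_shift = {
    "animal": np.array([1.0, 0.0, -1.0, 0.5, 0.0]),
    "tool": np.array([-1.0, 1.0, 0.0, 0.0, 0.5]),
}
classes = [c for c in true_shift for _ in range(n_per_class)]
S = rng.normal(size=(len(classes), 5))                   # singular vectors
P = predict_with_shifts(S, classes, true_shift)          # plural vectors ...
P = P + 0.05 * rng.normal(size=S.shape)                  # ... plus small noise

# Fit both models on all pairs and score them in-sample.
M = fit_global_map(S, P)
shifts = fit_class_shifts(S, P, classes)
cos_global = mean_cosine(S @ M, P)
cos_class = mean_cosine(predict_with_shifts(S, classes, shifts), P)
print(f"global map: {cos_global:.3f}   class shifts: {cos_class:.3f}")
```

Because the synthetic plurals were generated from class-specific shifts, the class-based predictor fits them better by construction; the sketch only shows how the two kinds of prediction differ and says nothing about which model fits real data. In the study itself, the predicted vectors are compared through their alignment with the speech signal of plural tokens, which is not modeled here.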