Making sense of spoken plurals

From arXiv. 29 pages including references (24 excluding references), 11 figures, 3 tables. This article is under review at the journal "The Mental Lexicon".

Distributional semantics offers new ways to study the semantics of morphology. This study focuses on the semantics of noun singulars and their plural inflectional variants in English. Our goal is to compare two models of the conceptualization of plurality. One model (FRACSS) proposes that all singular-plural pairs should be taken into account when predicting plural semantics from singular semantics. The other model (CCA) argues that the conceptualization of plurality depends primarily on the semantic class of the base word. We compare the two models on the basis of how well the speech signal of plural tokens in a large corpus of spoken American English aligns with the semantic vectors predicted by each model. Two measures are employed: the performance of a form-to-meaning mapping, and the correlations between form distances and meaning distances. Both measures converge on a superior alignment for CCA. Our results suggest that usage-based approaches to pluralization, in which a given word's own semantic neighborhood is given priority, outperform theories according to which pluralization is conceptualized as a process building on high-level abstraction. We see that what has often been conceived of as a highly abstract concept, [+plural], is better captured via a family of mid-level partial generalizations.
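The contrast between the two models can be made concrete with a small sketch. It is an illustration under stated assumptions, not the authors' implementation: the FRACSS-style model is approximated here as a single least-squares linear map fitted on all singular-plural pairs, and the CCA-style model as one average shift vector per semantic class. The vectors, class labels, dimensions, and all function names are hypothetical and synthetic. Because the synthetic plurals are generated from class-specific shifts, the class-based prediction fits better by construction, so the comparison shows only the mechanics and says nothing about real data.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_classes, per_class = 8, 3, 40

# Synthetic stand-ins for embeddings (assumption): every semantic class
# gets its own plural shift, plus a little noise.
class_ids = np.repeat(np.arange(n_classes), per_class)
singulars = rng.normal(size=(class_ids.size, dim))
true_shifts = rng.normal(size=(n_classes, dim))
noise = 0.05 * rng.normal(size=singulars.shape)
plurals = singulars + true_shifts[class_ids] + noise


def fit_global_map(sg, pl):
    """FRACSS-style sketch: one linear map fitted on all pairs."""
    weights, *_ = np.linalg.lstsq(sg, pl, rcond=None)
    return weights


def fit_class_shifts(sg, pl, classes, n_classes):
    """CCA-style sketch: one average shift vector per semantic class."""
    return np.stack(
        [(pl[classes == c] - sg[classes == c]).mean(axis=0)
         for c in range(n_classes)]
    )


def mean_cosine(predicted, observed):
    """Average cosine similarity between matching rows."""
    dots = (predicted * observed).sum(axis=1)
    norms = (np.linalg.norm(predicted, axis=1)
             * np.linalg.norm(observed, axis=1))
    return float((dots / norms).mean())


# Predict plural vectors from singular vectors under each model.
global_pred = singulars @ fit_global_map(singulars, plurals)
shifts = fit_class_shifts(singulars, plurals, class_ids, n_classes)
class_pred = singulars + shifts[class_ids]

global_score = mean_cosine(global_pred, plurals)
class_score = mean_cosine(class_pred, plurals)
print(f"global map: {global_score:.3f}  class shifts: {class_score:.3f}")
```

In the study itself, the predicted vectors are not scored against observed plural vectors in this way; they are evaluated by how well the speech signal of plural tokens aligns with them, which this sketch does not model.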

