Artificial neural networks can generalize productively to novel contexts. Can they also learn exceptions to those productive rules? We explore this question using the case of restrictions on English passivization (e.g., the fact that "The vacation lasted five days" is grammatical, but "*Five days was lasted by the vacation" is not). We collect human acceptability judgments for passive sentences with a range of verbs, and show that the probability distribution defined by GPT-2, a language model, correlates highly with the human judgments. We also show that the relative acceptability of a verb in the active vs. passive voice is positively correlated with the relative frequency of its occurrence in those voices. These results provide preliminary support for the entrenchment hypothesis, according to which learners track and use the distributional properties of their input to learn negative exceptions to rules. At the same time, this hypothesis fails to explain the magnitude of unpassivizability demonstrated by certain individual verbs, suggesting that other cues to exceptionality are available in the linguistic input.
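As a rough illustration of the frequency analysis described in the abstract, the sketch below correlates each verb's passive-to-active frequency ratio with the difference between its passive and active acceptability ratings. The smoothed log ratio, the use of Pearson's r, the function names, and the input layout are all assumptions made for this example rather than the procedure the study followed, and the numbers are placeholders, not data from the paper.

```python
import math


def log_passive_ratio(passive_count, active_count, smoothing=1.0):
    """Smoothed log ratio of passive to active occurrences of a verb.

    The additive smoothing (an assumption of this sketch) keeps the ratio
    finite for verbs that never occur in one of the two voices.
    """
    return math.log((passive_count + smoothing) / (active_count + smoothing))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def frequency_acceptability_correlation(verbs):
    """Correlate relative frequency with relative acceptability.

    `verbs` maps a verb to a tuple
    (passive_count, active_count, passive_rating, active_rating).
    """
    freq = [log_passive_ratio(p, a) for p, a, _, _ in verbs.values()]
    accept = [p_rating - a_rating for _, _, p_rating, a_rating in verbs.values()]
    return pearson(freq, accept)


# Placeholder values invented for illustration only; not data from the study.
toy_verbs = {
    "verb_a": (500, 1000, 6.0, 6.5),
    "verb_b": (50, 1000, 4.5, 6.5),
    "verb_c": (1, 1000, 2.0, 6.5),
}
r = frequency_acceptability_correlation(toy_verbs)
```

Under the entrenchment hypothesis, `r` would be expected to be positive: verbs that rarely appear in the passive should also be rated as less acceptable in the passive relative to the active. Verbs whose passive ratings fall well below what their frequency ratio predicts would correspond to the residual cases the abstract says this hypothesis does not explain.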