Debate over the nature of human language is sharply divided between two opposing camps: those who hold that statistical surface distributions, in particular measures such as surprisal, offer the better account of language processing, and those who hold that discrete hierarchical structures encoding linguistic information, such as syntactic structure, are the better tool. In this paper, we show that this dichotomy is a false one. Because statistical measures can be defined over either structural or non-structural models, we are able to provide empirical evidence that only models of surprisal that reflect syntactic structure can account for language regularities.