Language is a uniquely human trait that conveys information efficiently by organizing the word sequences of sentences into hierarchical structures. A central question persists: why is human language hierarchical? In this study, we show that hierarchization optimally addresses the challenge posed by our limited working memory capacity. We establish a likelihood function that directly quantifies how well the average number of units under a given language processing mechanism aligns with human working memory capacity (WMC). The maximum likelihood estimate (MLE) of this function, θ_MLE, turns out to be the mean number of units. Through computational simulations of symbol sequences and validation analyses of natural language sentences, we find that hierarchical processing far outperforms linear processing at keeping θ_MLE below the human WMC limit as sequence or sentence length increases. Hierarchical processing also shows a converging pattern related to children's WMC development. These results suggest that constructing hierarchical structures optimizes the processing efficiency of sequential language input while staying within memory constraints, offering an explanation for the universal hierarchical nature of human language.