We consider the possible orders of a linguistic structure formed by $n$ elements, for instance subject, direct object and verb ($n=3$), or subject, direct object, indirect object and verb ($n=4$). We investigate whether the frequencies of the $n!$ possible orders are constrained by two principles. The first is entropy minimization, a principle that has been suggested to shape natural communication systems at distinct levels of organization. The second is swap distance minimization, namely a preference for word orders that require fewer swaps of adjacent elements to be produced from a source order. We present average swap distance, a novel score for research on swap distance minimization, and investigate the theoretical distribution of that score for any $n$: its minimum and maximum values, and its expected value in die rolling experiments or when the word order frequencies are shuffled. We then test whether entropy and average swap distance are significantly small in distinct linguistic structures with $n=3$ or $n=4$, in agreement with the corresponding minimization principles. We find strong evidence of entropy minimization and swap distance minimization with respect to a die rolling experiment. The evidence of these two forces with respect to a Pólya urn process is strong for $n=4$ but weaker for $n=3$. We still find evidence of swap distance minimization when word order frequencies are shuffled, indicating that swap distance minimization effects go beyond the pressure to minimize word order entropy.
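To make the two scores concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the swap distance between two orders is the minimum number of adjacent swaps needed to turn one into the other (the number of inversions), that average swap distance is the frequency-weighted mean of that distance to a source order, and that the source order defaults to the most frequent one. The function names and the toy counts are hypothetical and chosen only for illustration.

```python
from math import log


def swap_distance(order, source):
    """Minimum number of adjacent swaps that turn `source` into `order`.

    This equals the number of inversions of `order` relative to `source`.
    """
    rank = {element: i for i, element in enumerate(source)}
    seq = [rank[element] for element in order]
    return sum(
        1
        for i in range(len(seq))
        for j in range(i + 1, len(seq))
        if seq[i] > seq[j]
    )


def entropy(freqs):
    """Shannon entropy (in nats) of the word order frequency distribution."""
    total = sum(freqs.values())
    return -sum((f / total) * log(f / total) for f in freqs.values() if f > 0)


def average_swap_distance(freqs, source=None):
    """Frequency-weighted mean swap distance to a source order.

    Assumption: when no source is given, the most frequent order is used.
    """
    if source is None:
        source = max(freqs, key=freqs.get)
    total = sum(freqs.values())
    weighted = sum(f * swap_distance(order, source) for order, f in freqs.items())
    return weighted / total


# Hypothetical counts for the n = 3 case (S = subject, O = object, V = verb).
toy_counts = {"SOV": 50, "SVO": 40, "VSO": 7, "VOS": 2, "OVS": 1, "OSV": 0}
# The same six orders with equal counts, as a uniform reference point.
uniform_counts = {order: 1 for order in toy_counts}

print(entropy(toy_counts), average_swap_distance(toy_counts))
print(entropy(uniform_counts), average_swap_distance(uniform_counts, source="SOV"))
```

With these toy counts the average swap distance is 0.62, below the value of 1.5 obtained when all six orders are equally frequent, and the entropy is likewise below $\log 6$. Judging whether such differences are significant would still require a null model, such as the die rolling, Pólya urn, or shuffling baselines mentioned above, which this sketch does not implement.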