We construct multiperiodic processes, a simple example of stationary ergodic (but not mixing) processes over the natural numbers that have a vanishing entropy rate under a mild condition. Multiperiodic processes are supported on randomly shifted deterministic sequences, called multiperiodic sequences, which can be generated efficiently by an algorithm called the Infinite Clock. Under a suitable parameterization, the relative frequencies of particular numbers in multiperiodic sequences follow Zipf's law. In exactly the same setting, the corresponding multiperiodic processes exhibit an asymptotic power-law growth of block entropy, known as Hilberg's law. Hilberg's law is believed to hold for statistical language models in particular.