In a 1976 paper, Rauzy studied two complexity notions, $\underline{\beta}$ and $\overline{\beta}$, for infinite sequences over a finite alphabet. The function $\underline{\beta}$ attains its maximum exactly on the Borel normal sequences, and $\overline{\beta}$ attains its minimum exactly on the sequences whose sum with any Borel normal sequence is again Borel normal. Although the definitions of $\underline{\beta}$ and $\overline{\beta}$ do not involve finite-state automata, we establish connections between them and the lower and upper finite-state dimensions $\underline{\rm dim}$ and $\overline{\rm dim}$ (or other equivalent notions such as finite-state compression ratio, aligned entropy, or cumulative log-loss of finite-state predictors). We show tight lower and upper bounds on $\underline{\rm dim}$ and $\overline{\rm dim}$ as functions of $\underline{\beta}$ and $\overline{\beta}$, respectively. In particular, this implies that the sequences with $\overline{\rm dim}$ equal to zero are exactly those that, when added to any Borel normal sequence, yield a Borel normal sequence. We also show that the finite-state dimensions $\underline{\rm dim}$ and $\overline{\rm dim}$ are essentially subadditive. We need two technical tools that are of independent interest. The first is the family of local finite-state automata, which are automata whose memory consists of the last $k$ symbols read, for some fixed integer $k$. We show that compressors based on local finite-state automata are as good as standard finite-state compressors. The second is a notion of finite-state relational (non-deterministic) compressor, which can compress an input in several ways provided that the input can always be recovered from any of its outputs. We show that such compressors cannot compress more than standard (deterministic) finite-state compressors.
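As an informal illustration of the local finite-state automata mentioned above, the following Python sketch models the state as the tuple of the last $k$ symbols read. The function names `step` and `run`, and the choice to start from an empty memory, are assumptions made for this example only and are not taken from the paper.

```python
def step(state, symbol, k):
    """Transition of a local automaton: append the symbol, keep the last k."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    return (state + (symbol,))[-k:]


def run(word, k):
    """Return the list of states visited while reading `word`.

    The initial state is the empty tuple (an assumption of this sketch);
    after at least k symbols, the state is exactly the last k symbols read.
    """
    state = ()
    visited = []
    for symbol in word:
        state = step(state, symbol, k)
        visited.append(state)
    return visited


if __name__ == "__main__":
    # With k = 2 the automaton only ever remembers the two most recent symbols.
    for state in run("abcab", 2):
        print(state)
```

Because the state is determined by the last $k$ symbols alone, an automaton of this kind over an alphabet of size $b$ has at most $b^k$ states once its memory is full.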