By analogy with the duality between the recurrence time and the longest match length, we introduce a quantity dual to the maximal repetition length, which we call the repetition time. Using the generalized Kac lemma for successive recurrence times by Chen Moy, we sandwich the repetition time in terms of min-entropies with no or relatively short conditioning. The only assumptions are stationarity and ergodicity. The proof is surprisingly short, and the claim is fully general, in contrast to earlier approaches by Szpankowski and by Dębowski. We discuss the analogy between this result and the Wyner-Ziv/Ornstein-Weiss theorem, which sandwiches the recurrence time in terms of Shannon entropies. We formulate the respective sandwich bounds in a way that also applies to the case of stretched exponential growth observed empirically for natural language.
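To make the duality concrete, here is a minimal Python sketch for finite strings. The abstract does not give formal definitions, so the ones used here are assumptions: the maximal repetition length of a string is taken to be the length of its longest substring that occurs at least twice (overlaps allowed), and the repetition time for block length k is taken to be the shortest prefix length at which some block of length k has occurred twice. The function names `maximal_repetition_length` and `repetition_time` are hypothetical, and the brute-force search is for illustration only, not an efficient or authoritative implementation.

```python
def maximal_repetition_length(text):
    """Length of the longest substring occurring at least twice (assumed definition)."""
    n = len(text)
    best = 0
    # A repeat of length k + 1 implies a repeat of length k,
    # so we can grow the candidate length one step at a time.
    while best < n:
        k = best + 1
        seen = set()
        found = False
        for i in range(n - k + 1):
            block = text[i:i + k]
            if block in seen:
                found = True
                break
            seen.add(block)
        if not found:
            break
        best = k
    return best


def repetition_time(text, k):
    """Smallest n such that text[:n] contains a repeated block of length k.

    Assumed definition; expects k >= 1. Returns None if no block of
    length k repeats within the given finite string.
    """
    seen = set()
    for i in range(len(text) - k + 1):
        block = text[i:i + k]
        if block in seen:
            # The second occurrence ends at position i + k.
            return i + k
        seen.add(block)
    return None


if __name__ == "__main__":
    sample = "abracadabra"
    print(maximal_repetition_length(sample))
    print(repetition_time(sample, 4))
```

Under these assumed definitions the two quantities are dual in the same sense as the recurrence time and the longest match length: the maximal repetition length of the prefix of length n is at least k exactly when the repetition time for k is at most n.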