The normalized substring complexity $\delta$ of a string is defined as $\max_k \{c[k]/k\}$, where $c[k]$ is the number of \textit{distinct} substrings of length $k$. Despite its simple definition, this measure has recently attracted attention because of its established relationship to popular string compression algorithms. We consider the problem of computing $\delta$ online, where the string arrives as a stream. We present two algorithms that solve the problem: one runs in $O(\log n)$ amortized time per character, and the other runs in $O(\log^3 n)$ worst-case time per character. To our knowledge, this is the first polylogarithmic-time online solution to this problem.
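To make the definition concrete, the following is a minimal brute-force sketch in Python. It is not either of the two algorithms announced above: it recomputes $\delta$ from scratch for every prefix and takes polynomial time per character, so it serves only as a reference for what the online problem asks. The function names `substring_complexity` and `online_delta_naive` are hypothetical and chosen for illustration.

```python
from fractions import Fraction


def substring_complexity(s: str) -> Fraction:
    """Brute-force delta: max over k of c[k] / k, where c[k] is the
    number of distinct substrings of s of length k."""
    n = len(s)
    best = Fraction(0)
    for k in range(1, n + 1):
        # Collect every length-k window; the set keeps only distinct ones.
        distinct = {s[i:i + k] for i in range(n - k + 1)}
        best = max(best, Fraction(len(distinct), k))
    return best


def online_delta_naive(stream):
    """Yield delta of each prefix as characters arrive from the stream.
    Naive baseline only: it recomputes everything for every new character."""
    prefix = ""
    for ch in stream:
        prefix += ch
        yield substring_complexity(prefix)
```

For the string `0001011100`, the counts are $c[1]=2$, $c[2]=4$, $c[3]=8$, so the maximum is attained at $k=3$ and `substring_complexity("0001011100")` returns `Fraction(8, 3)`. Exact fractions are used instead of floats so that comparisons between ratios $c[k]/k$ are not affected by rounding.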