The Burrows--Wheeler transform is usually viewed as a clustering transform: it tends to group equal letters into long runs. We study the opposite extremal regime, where the BWT output is completely unclustered, that is, has as many equal-letter runs as positions. Known results imply, on the one hand, that passing from a Lyndon word to its BWT can increase the number of runs by at most a factor of two, and, on the other hand, that over every alphabet of size at least three completely unclustered BWT images exist in every length. Between these two facts lies an extremal problem. For \(k\ge3\), let \(U_k(n)\) be the minimum cyclic run number of a primitive necklace of length \(n\) over a \(k\)-letter alphabet whose BWT has \(n\) runs. We prove the universal lower bound \(U_k(n)\ge\lceil n/2\rceil\), reduce the sharpness problem for one-cycle BWT images \(L\) to the Hamming identity \[ \cruns(\BWT^{-1}(L))=\dH(L,\sort(L)), \] and develop a natural multiset-of-necklaces relaxation with an explicit constant-cycle correction. We compute the small values, including the exceptional value \(U_k(6)=4\), prove a parity obstruction for the Parikh vectors of sharp examples, and determine the multiset relaxation exactly. Finally, for every prime \(p\equiv5\pmod8\) for which \(2\) is a primitive root modulo \(p\), we prove sharpness in the adjacent lengths \(p-1\) and \(p\). Under the corresponding Artin-type infinitude hypothesis, this gives infinitely many adjacent sharp pairs.
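The quantities in the Hamming identity can be checked numerically on small words. The following is a minimal Python sketch, not the paper's method: it assumes the sentinel-free BWT (last column of the sorted rotations), counts cyclic runs as the number of cyclically adjacent unequal letters, and uses the hypothetical helper names `bwt`, `runs`, `cyclic_runs`, and `hamming`, none of which come from the source.

```python
def bwt(w):
    """Sentinel-free BWT (an assumption): last column of the sorted rotations of w."""
    rotations = sorted(w[i:] + w[:i] for i in range(len(w)))
    return "".join(r[-1] for r in rotations)


def runs(s):
    """Number of maximal equal-letter runs in the linear word s."""
    return sum(1 for i in range(len(s)) if i == 0 or s[i] != s[i - 1])


def cyclic_runs(w):
    """Equal-letter runs of w read cyclically; a constant word counts as one run."""
    # w[i - 1] wraps around to the last letter when i == 0.
    changes = sum(1 for i in range(len(w)) if w[i] != w[i - 1])
    return max(changes, 1)


def hamming(a, b):
    """Hamming distance between two words of equal length."""
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    w = "banana"
    L = bwt(w)                    # "nnbaaa"
    F = "".join(sorted(L))        # "aaabnn", the first column
    print(L, runs(L))             # nnbaaa 3
    print(cyclic_runs(w), hamming(L, F))  # 6 6

    # A completely unclustered image: every position starts a new run.
    u = "abc"
    print(bwt(u), runs(bwt(u)) == len(u))  # cab True
```

In row \(i\) of the sorted rotation matrix, the last letter cyclically precedes the first letter in the word, so mismatches between \(L\) and \(\sort(L)\) correspond one-to-one to cyclic letter changes. This is why the two printed counts agree for a non-constant word.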