Sorting is the task of ordering $n$ elements using pairwise comparisons. It is well known that $m=\Theta(n\log n)$ comparisons are both necessary and sufficient when the comparison outcomes are observed without noise. In this paper, we study sorting in the presence of noise. Unlike the common approach in the literature, which aims to minimize the number of pairwise comparisons $m$ needed to achieve a given error probability, we consider randomized algorithms with expected number of queries $\mathsf{E}[M]$ and aim to characterize the maximal ratio $\frac{n\log n}{\mathsf{E}[M]}$ for which the ordering of the elements can be estimated with asymptotically vanishing error probability. We refer to this maximal ratio as the noisy sorting capacity and derive upper and lower bounds on it. We establish two lower bounds, one for fixed-length algorithms and one for variable-length algorithms. The proposed algorithms exploit the connection between noisy searching and channel coding with feedback, combining insertion sort with the Burnashev-Zigangirov algorithm for channel coding with feedback. Moreover, we derive an upper bound on the noisy sorting capacity and an upper bound on the achievable rates of algorithms based on insertion sort.
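To make the setting concrete, the following is a minimal sketch of insertion sort driven by noisy comparisons. It is not the algorithm proposed in the paper: in place of the Burnashev-Zigangirov search it uses plain binary search in which every comparison is repeated and decided by majority vote, a simpler and less query-efficient substitute. The noise model (each answer flipped independently with probability `p`), the base-2 logarithm in `rate`, and all function names are assumptions made for illustration.

```python
import math
import random


def make_noisy_less(p, rng):
    """Comparator whose answer is flipped with probability p (assumed noise model)."""
    def noisy_less(a, b):
        truth = a < b
        return (not truth) if rng.random() < p else truth
    return noisy_less


def majority_less(a, b, noisy_less, reps, counter):
    """Ask the same noisy comparison `reps` times (odd) and return the majority answer."""
    votes = sum(noisy_less(a, b) for _ in range(reps))
    counter[0] += reps  # track the total number of noisy queries issued
    return 2 * votes > reps


def noisy_insertion_sort(items, noisy_less, reps):
    """Binary insertion sort; each comparison is a majority vote over noisy queries.

    Returns the estimated ordering and the total number of queries used.
    """
    counter = [0]
    out = []
    for x in items:
        lo, hi = 0, len(out)
        while lo < hi:
            mid = (lo + hi) // 2
            if majority_less(x, out[mid], noisy_less, reps, counter):
                hi = mid
            else:
                lo = mid + 1
        out.insert(lo, x)
    return out, counter[0]


def rate(n, queries):
    """Empirical ratio n*log(n)/M for a single run (base-2 logarithm is an assumption)."""
    return n * math.log2(n) / queries


if __name__ == "__main__":
    rng = random.Random(0)
    data = rng.sample(range(1000), 64)
    estimate, queries = noisy_insertion_sort(data, make_noisy_less(0.1, rng), reps=41)
    print("sorted correctly:", estimate == sorted(data))
    print("queries:", queries, "ratio:", rate(len(data), queries))
```

Because this sketch spends a fixed number of repetitions on every comparison, its query count is deterministic given the input, which corresponds loosely to the fixed-length setting. A variable-length scheme would instead stop querying once it is sufficiently confident, so the relevant cost is the expected number of queries $\mathsf{E}[M]$.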