We study the connections between sorting and the binary search tree (BST) model, with the aim of showing that the two fields are connected more deeply than is currently appreciated. Any BST can be used to sort by inserting the keys one by one, but this relationship is very limited and, importantly, says nothing about parallel sorting. We show what we believe to be the first formal relationship between the BST model and sorting. Namely, we show that a large class of sorting algorithms, which includes mergesort, quicksort, insertion sort, and almost every instance-optimal sorting algorithm, is equivalent in cost to offline BST algorithms. Our main theoretical tool is the geometric interpretation of the BST model introduced by Demaine et al., which establishes an equivalence between searches on a BST and point sets in the plane satisfying a certain property. To illustrate the utility of our approach, we introduce the log-interleave bound, a measure of the information-theoretic complexity of a permutation $\pi$, which is within a $\lg \lg n$ multiplicative factor of a known lower bound in the BST model. We also devise a parallel sorting algorithm with polylogarithmic span that sorts a permutation $\pi$ using a number of comparisons proportional to its log-interleave bound. Our result on sorting and offline BST algorithms can then be used to show the existence of an offline BST algorithm whose cost is within a constant factor of the log-interleave bound of any permutation $\pi$.
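As a point of reference for the limited relationship mentioned above (sorting by inserting the keys one by one into a BST), the following is a minimal sketch, not the construction studied in this work. It assumes distinct integer keys and a plain unbalanced BST. The names `Node` and `bst_sort` are hypothetical and chosen for illustration only. The comparison count it returns is simply the number of key comparisons made during insertion, and it is not the log-interleave bound.

```python
from typing import List, Optional, Tuple


class Node:
    """A node of a plain (unbalanced) binary search tree."""

    def __init__(self, key: int) -> None:
        self.key = key
        self.left: Optional["Node"] = None
        self.right: Optional["Node"] = None


def bst_sort(keys: List[int]) -> Tuple[List[int], int]:
    """Sort distinct keys by inserting them one by one into a BST.

    Returns the sorted keys and the number of key comparisons
    performed during the insertions (illustrative cost measure only).
    """
    root: Optional[Node] = None
    comparisons = 0

    for key in keys:
        if root is None:
            root = Node(key)
            continue
        node = root
        while True:
            comparisons += 1  # one comparison against the current node
            if key < node.key:
                if node.left is None:
                    node.left = Node(key)
                    break
                node = node.left
            else:
                if node.right is None:
                    node.right = Node(key)
                    break
                node = node.right

    # Iterative in-order traversal reads the keys back in sorted order.
    out: List[int] = []
    stack: List[Node] = []
    current = root
    while stack or current is not None:
        while current is not None:
            stack.append(current)
            current = current.left
        current = stack.pop()
        out.append(current.key)
        current = current.right

    return out, comparisons
```

On an already sorted input this sketch degenerates to a path and uses a quadratic number of comparisons, and the insertions are inherently sequential, which is consistent with the remark that this basic relationship says nothing about parallel sorting.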