In 1976, Fredman proposed the following algorithmic problem: given a ground set $X$, a partial order $P$ over $X$, and a comparison oracle $O_L$ that specifies a linear order $L$ over $X$ extending $P$, output $X$ in its sorted order under $L$. A query to $O_L$ takes distinct $x, x' \in X$ as input and outputs whether $x <_L x'$ or $x' <_L x$. If $e(P)$ denotes the number of linear extensions of $P$, then $\log e(P)$ is a worst-case lower bound on the number of queries needed to output the sorted order of $X$. Fredman did not specify the form in which the partial order is given. Haeupler, Hladík, Iacono, Rozhon, Tarjan, and Tětek ('24) propose to assume as input a directed acyclic graph $G$ with $m$ edges and $n=|X|$ vertices. Denote by $P_G$ the partial order induced by $G$. Algorithmic performance is measured in running time and in the number of queries used; their algorithm uses $\Theta(m + n + \log e(P_G))$ time and $\Theta(\log e(P_G))$ queries to output $X$ in its sorted order, which is worst-case optimal in both running time and queries. Their algorithm combines topological sorting with heapsort. Their analysis relies on sophisticated counting arguments using entropy, on sets defined recursively over the run of their algorithm, and on vertices in the graph that they identify as bottlenecks for sorting. In this paper, we do away with this sophistication. We show that when the input is a directed acyclic graph, the problem admits a simple solution using $\Theta(m + n + \log e(P_G))$ time and $\Theta(\log e(P_G))$ queries. In particular, our proofs are much simpler, as we avoid advanced charging arguments and data structures and instead rely on two brief observations.
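To make the quantities in the problem statement concrete, the following is a minimal sketch that counts the linear extensions $e(P_G)$ of a small DAG by brute force and derives the resulting query lower bound $\lceil \log_2 e(P_G) \rceil$. It is not the algorithm of this paper, nor that of Haeupler et al.; it runs in exponential time and is only meant for tiny inputs. The names `count_linear_extensions`, `query_lower_bound`, and `CountingOracle` are hypothetical and introduced here for illustration, as is the choice of base-2 logarithm.

```python
from functools import lru_cache
from math import ceil, log2


def count_linear_extensions(n, edges):
    """Count e(P_G) for a DAG on vertices 0..n-1 (illustrative, exponential time).

    An edge (u, v) means u must precede v in every linear extension.
    """
    preds = [0] * n  # preds[v] is a bitmask of the direct predecessors of v
    for u, v in edges:
        preds[v] |= 1 << u

    full = (1 << n) - 1

    @lru_cache(maxsize=None)
    def count(placed):
        # placed is a bitmask of the vertices already put into the order
        if placed == full:
            return 1
        total = 0
        for v in range(n):
            not_placed = ((placed >> v) & 1) == 0
            preds_done = (preds[v] & ~placed) == 0
            if not_placed and preds_done:
                total += count(placed | (1 << v))
        return total

    return count(0)


def query_lower_bound(n, edges):
    """Worst-case lower bound ceil(log2 e(P_G)) on the number of oracle queries."""
    return ceil(log2(count_linear_extensions(n, edges)))


class CountingOracle:
    """Hypothetical stand-in for O_L: answers x <_L y and counts the queries."""

    def __init__(self, linear_order):
        self.rank = {x: i for i, x in enumerate(linear_order)}
        self.queries = 0

    def less(self, x, y):
        self.queries += 1
        return self.rank[x] < self.rank[y]
```

For the diamond-shaped DAG with edges `(0, 1), (0, 2), (1, 3), (2, 3)`, only the relative order of vertices 1 and 2 is undetermined, so there are two linear extensions and one query is necessary. An antichain on three vertices has $3! = 6$ linear extensions, giving a lower bound of three queries. A wrapper such as `CountingOracle` lets one compare the number of queries any candidate sorting procedure uses against this bound.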