Tight Bounds for Sorting Under Partial Information

Sorting has a natural generalization where the input consists of: (1) a ground set $X$ of size $n$, (2) a partial oracle $O_P$ specifying some fixed partial order $P$ on $X$, and (3) a linear oracle $O_L$ specifying a linear order $L$ that extends $P$. The goal is to recover the linear order $L$ on $X$ using the fewest linear oracle queries. In this problem, we measure algorithmic complexity through three metrics: oracle queries to $O_L$, oracle queries to $O_P$, and the time spent. Any algorithm requires at least $\log_2 e(P)$ linear oracle queries in the worst case to recover the linear order on $X$, where $e(P)$ denotes the number of linear extensions of $P$. Kahn and Saks presented the first algorithm that uses $\Theta(\log e(P))$ linear oracle queries (using $O(n^2)$ partial oracle queries and exponential time). The state of the art for the general problem is due to Cardinal, Fiorini, Joret, Jungers and Munro, who at STOC'10 separated the partial and linear oracle queries into a preprocessing phase and a query phase. They preprocess $P$ using $O(n^2)$ partial oracle queries and $O(n^{2.5})$ time. Then, given $O_L$, they uncover the linear order on $X$ in $\Theta(\log e(P))$ linear oracle queries and $O(n + \log e(P))$ time, which is worst-case optimal in the number of linear oracle queries but not in the time spent. For $c \geq 1$, our algorithm can preprocess $O_P$ using $O(n^{1 + \frac{1}{c}})$ queries and time. Given $O_L$, we uncover $L$ using $\Theta(c \log e(P))$ queries and time. We show a matching lower bound: there exist positive constants $(\alpha, \beta)$ such that, for any constant $c \geq 1$, any algorithm that uses at most $\alpha \cdot n^{1 + \frac{1}{c}}$ preprocessing must use at least $\beta \cdot c \log e(P)$ linear oracle queries in the worst case. Thus, we solve the problem of sorting under partial information through an algorithm that is asymptotically tight across all three metrics.
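
To make the information-theoretic bound $\log_2 e(P)$ concrete, the following minimal Python sketch counts the linear extensions of a small partial order by brute force and reports $\lceil \log_2 e(P) \rceil$. It is only an illustration and not the algorithm described above: the function names `count_linear_extensions` and `query_lower_bound`, and the encoding of $P$ as a list of pairs `(a, b)` meaning $a < b$ on elements `0..n-1`, are assumptions made for this example. The count takes time exponential in $n$.

```python
from functools import lru_cache
from math import ceil, log2


def count_linear_extensions(n, relations):
    """Count e(P) for a partial order on elements 0..n-1.

    `relations` is an iterable of pairs (a, b) meaning a < b in P
    (hypothetical encoding; cover relations are sufficient).
    Brute-force dynamic programming over subsets: exponential in n.
    """
    # preds[b] is a bitmask of the elements that must come before b.
    preds = [0] * n
    for a, b in relations:
        preds[b] |= 1 << a

    full = (1 << n) - 1

    @lru_cache(maxsize=None)
    def extend(placed):
        # `placed` is a bitmask of elements already put in the order.
        if placed == full:
            return 1
        total = 0
        for x in range(n):
            already_placed = (placed >> x) & 1
            preds_ready = (preds[x] & placed) == preds[x]
            if not already_placed and preds_ready:
                total += extend(placed | (1 << x))
        return total

    return extend(0)


def query_lower_bound(n, relations):
    """Worst-case lower bound ceil(log2 e(P)) on yes/no comparison queries."""
    return ceil(log2(count_linear_extensions(n, relations)))


if __name__ == "__main__":
    # Antichain on 4 elements: every permutation is a linear extension.
    print(count_linear_extensions(4, []), query_lower_bound(4, []))
    # Two disjoint chains 0 < 1 and 2 < 3.
    two_chains = [(0, 1), (2, 3)]
    print(count_linear_extensions(4, two_chains), query_lower_bound(4, two_chains))
```

With no relations (an antichain), $e(P) = n!$ and the bound reduces to the familiar $\log_2 n!$ of comparison sorting; with a total order, $e(P) = 1$ and no linear oracle queries are needed.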
