We present a novel characterization of stable mergesort functions using relational parametricity, and show that it implies the correctness of mergesort. As a result, one can prove the correctness of several variations of mergesort (e.g., top-down, bottom-up, tail-recursive, non-tail-recursive, smooth, and non-smooth mergesorts) by proving the characterization property for each variation. To further motivate this work, we show a performance trade-off between tail-recursive and non-tail-recursive mergesorts that (1) the former in call-by-value evaluation avoids using up stack space and is efficient and (2) the latter in call-by-need evaluation is an optimal incremental sort, meaning that it performs only $\mathcal{O}(n + k \log k)$ comparisons to compute the least (or greatest) $k$ items of a list of length $n$. Thanks to our characterization and the parametricity translation, we deduced the correctness results, including stability, of various implementations of mergesort for lists, including highly optimized ones, in the Coq proof assistant.
翻译:我们利用关系参数化提出了一种稳定的归并排序函数的新颖刻画,并证明该刻画蕴含归并排序的正确性。由此,只需证明每种变体满足该刻画性质,即可验证多种归并排序变体(如自顶向下、自底向上、尾递归、非尾递归、平滑与非平滑归并排序)的正确性。为进一步推动此项工作,我们揭示了尾递归与非尾递归归并排序之间的性能权衡:(1)在传值调用求值中,前者可避免栈空间消耗且运行高效;(2)在传需要求调用求值中,后者作为最优增量排序算法,仅需 $\mathcal{O}(n + k \log k)$ 次比较即可从长度为 $n$ 的列表中找出最小(或最大)的 $k$ 个元素。得益于我们的刻画方法与参数化翻译,我们在 Coq 证明助手中推导出包括高度优化版本在内的多种列表归并排序实现的正确性结论(包含稳定性)。