zSort: Stable Distribution Sort using Z-Score Partitioning

Sorting is a foundational primitive in modern data processing, and its cost directly affects the execution speed of high-performance data pipelines. However, the algorithmic landscape is currently split by a pervasive "Stability Tax": practitioners must give up either order preservation to gain throughput or execution speed to gain stability. To address this limitation, this paper introduces zSort, an adaptive z-score-based distribution sorting algorithm that guarantees stability while avoiding pass complexity that scales with key width. The proposed technique is evaluated through microarchitectural analysis and experiments. The microarchitectural analysis shows that zSort incurs a lower bad-speculation overhead (19.7%) than both stable baselines and several high-performance unstable algorithms, and sustains a competitive IPC of 1.44. Empirical evaluation across diverse input distributions and datasets of up to 10^7 64-bit elements shows that zSort consistently outperforms widely used comparison-based stable sorting algorithms, with speedups of up to 3x-4.5x, and performs somewhat better than LSD Radix, with larger gains on duplicate-heavy and partially ordered inputs. Despite providing stability, zSort achieves throughput comparable to high-performance unstable algorithms such as Skasort. It also maintains this performance on adaptive workloads where methods like Pdqsort typically excel, and it does not exhibit any extreme worst case. These results indicate that zSort substantially narrows the traditional performance gap between stable and unstable sorting and provides an efficient, stable sorting alternative.
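The abstract does not spell out how zSort partitions its input, so the following is only a minimal sketch of the general idea of a stable, z-score-driven distribution sort, not the paper's implementation. The function name `zscore_bucket_sort`, the bucket count, the clipping range of 3 standard deviations, the small-input threshold, and the depth limit are all assumptions chosen for illustration. Keys are mapped to buckets by their z-score, records are appended in input order so that equal keys keep their relative order, and each bucket is then sorted recursively.

```python
import math
from typing import Any, Callable, List, Optional, Sequence


def zscore_bucket_sort(
    items: Sequence[Any],
    key: Optional[Callable[[Any], float]] = None,
    num_buckets: int = 16,   # assumed value, not taken from the paper
    small: int = 32,         # assumed cutoff for the fallback sort
    clip: float = 3.0,       # assumed z-score clipping range
    depth: int = 8,          # assumed recursion limit
) -> List[Any]:
    """Illustrative stable distribution sort driven by z-scores."""
    if key is None:
        key = lambda x: x
    n = len(items)
    # Small or deeply nested inputs: fall back to Python's stable sorted().
    if n <= small or depth == 0:
        return sorted(items, key=key)

    ks = [key(x) for x in items]
    mean = sum(ks) / n
    std = math.sqrt(sum((k - mean) ** 2 for k in ks) / n)
    # Zero spread (or spread lost to float rounding): z-scores are undefined.
    if std == 0.0:
        return sorted(items, key=key)

    buckets: List[List[Any]] = [[] for _ in range(num_buckets)]
    for x, k in zip(items, ks):
        # The z-score is monotone in the key, so bucket order follows key order.
        z = max(-clip, min(clip, (k - mean) / std))
        idx = min(num_buckets - 1, int((z + clip) / (2 * clip) * num_buckets))
        # Appending in input order keeps equal keys in their original order.
        buckets[idx].append(x)

    out: List[Any] = []
    for bucket in buckets:
        out.extend(
            zscore_bucket_sort(bucket, key, num_buckets, small, clip, depth - 1)
        )
    return out
```

For example, `zscore_bucket_sort(records, key=lambda r: r[0])` orders `(key, payload)` tuples by key while leaving tuples with equal keys in their input order. The number of distribution passes in this sketch depends on how the keys are spread rather than on their bit width, which is the property the abstract contrasts with radix-style sorting; the microarchitectural behaviour reported in the paper is not something this sketch attempts to reproduce.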

