High-Performance and Flexible Parallel Algorithms for Semisort and Related Problems

Parallel algorithms · Parallelism · Algorithms · Sorting · Asymptotically optimal

April 20, 2023

Xiaojun Dong, Yunshu Wu, Zhongqi Wang, Laxman Dhulipala, Yan Gu, Yihan Sun

Semisort is a fundamental algorithmic primitive widely used in the design and analysis of efficient parallel algorithms. It takes as input an array of records and a function that extracts a key from each record, and reorders the records so that those with equal keys are contiguous. Since many applications only require collecting equal values, not fully sorting the input, semisort is broadly applicable, e.g., in string algorithms, graph analytics, and geometry processing, among many other domains. However, despite dozens of recent papers that use semisort in their theoretical analysis and the existence of an asymptotically optimal parallel semisort algorithm, most implementations of these parallel algorithms implement semisort in practice with comparison or integer sorting, due to potential performance issues in existing semisort implementations. In this paper, we revisit the semisort problem, with the goal of achieving a high-performance parallel semisort implementation with a flexible interface. Our approach extends easily to two related problems, histogram and collect-reduce. Our algorithms achieve strong speedups in practice and, importantly, outperform state-of-the-art parallel sorting and semisorting methods in almost all settings we tested, with varying input sizes, distributions, and key types. We also test two important applications with real-world data, and show that our algorithms improve performance over existing approaches. We believe that many other parallel algorithm implementations can be accelerated using our results.
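To make the problem statement concrete, here is a minimal sequential sketch of the semisort contract described in the abstract: records plus a key-extraction function in, records with equal keys adjacent out. It is not the paper's parallel algorithm. It is a simple hash-bucket illustration, and the function names and signatures (`semisort`, `histogram`, `collect_reduce`) are hypothetical. The abstract names histogram and collect-reduce without defining them, so the definitions used here (count the records per key, and combine the values of records sharing a key) are assumptions based on how the terms are commonly used.

```python
from operator import add


def semisort(records, key):
    """Return the records reordered so that equal keys are contiguous.

    Groups come out in first-seen key order, not sorted order:
    semisort only promises adjacency of equal keys.
    """
    buckets = {}
    for rec in records:
        buckets.setdefault(key(rec), []).append(rec)
    return [rec for group in buckets.values() for rec in group]


def histogram(records, key):
    """Count how many records share each key (assumed definition)."""
    counts = {}
    for rec in records:
        k = key(rec)
        counts[k] = counts.get(k, 0) + 1
    return counts


def collect_reduce(records, key, value, combine, identity):
    """Combine the values of all records sharing a key (assumed definition)."""
    acc = {}
    for rec in records:
        k = key(rec)
        acc[k] = combine(acc.get(k, identity), value(rec))
    return acc


records = [("b", 1), ("a", 2), ("b", 3), ("c", 4), ("a", 5)]
grouped = semisort(records, key=lambda r: r[0])
# grouped == [("b", 1), ("b", 3), ("a", 2), ("a", 5), ("c", 4)]
counts = histogram(records, key=lambda r: r[0])
# counts == {"b": 2, "a": 2, "c": 1}
sums = collect_reduce(records, key=lambda r: r[0], value=lambda r: r[1],
                      combine=add, identity=0)
# sums == {"b": 4, "a": 7, "c": 4}
```

Note that the output keys are grouped but not ordered ("b" precedes "a"), which is the relaxation that distinguishes semisort from a full sort.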
