ytopt: Autotuning Scientific Applications for Energy Efficiency at Large Scales

Autotuning · Energy efficiency · Surrogate model · Production systems

March 28, 2023

Xingfu Wu, Prasanna Balaprakash, Michael Kruse, Jaehoon Koo, Brice Videau, Paul Hovland, Valerie Taylor, Brad Geltz, Siddhartha Jana, Mary Hall

As we enter the exascale computing era, efficiently utilizing power and optimizing the performance of scientific applications under power and energy constraints have become critical and challenging. We propose a low-overhead autotuning framework that tunes performance and energy for hybrid MPI/OpenMP scientific applications at large scales and explores the tradeoffs between application runtime and power/energy for energy-efficient execution. We use this framework to autotune four ECP proxy applications: XSBench, AMG, SWFFT, and SW4lite. Our approach uses Bayesian optimization with a Random Forest surrogate model to search parameter spaces with up to 6 million different configurations on two large-scale production systems, Theta at Argonne National Laboratory and Summit at Oak Ridge National Laboratory. The experimental results show that the framework has low overhead and scales well. Using it to identify the best configurations, we achieve up to 91.59% performance improvement, up to 21.2% energy savings, and up to 37.84% EDP (energy-delay product) improvement on up to 4,096 nodes.
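The abstract reports runtime, energy, and EDP figures without spelling out how they relate. The sketch below shows one common way to compute them, assuming energy is average power times runtime, EDP is energy times runtime, and "improvement" means the relative reduction against a baseline run. The `Measurement` class, the configuration names, and all numbers are hypothetical and are not taken from the paper or from ytopt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """One application run (hypothetical record, not a ytopt type)."""
    runtime_s: float    # wall-clock runtime in seconds
    avg_power_w: float  # average power draw in watts

    @property
    def energy_j(self) -> float:
        # Energy in joules = average power (W) * runtime (s).
        return self.avg_power_w * self.runtime_s

    @property
    def edp(self) -> float:
        # Energy-delay product = energy (J) * runtime (s).
        return self.energy_j * self.runtime_s


def percent_improvement(baseline: float, tuned: float) -> float:
    """Relative reduction versus the baseline, in percent (lower metric is better)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - tuned) / baseline * 100.0


def best_config(candidates: dict, metric) -> str:
    """Name of the candidate that minimizes the given metric."""
    return min(candidates, key=lambda name: metric(candidates[name]))


# Made-up numbers for illustration only; these are not results from the paper.
baseline = Measurement(runtime_s=100.0, avg_power_w=200.0)
candidates = {
    "config_a": Measurement(runtime_s=80.0, avg_power_w=230.0),  # faster, more power
    "config_b": Measurement(runtime_s=90.0, avg_power_w=190.0),  # slower, less power
}

best_runtime = best_config(candidates, lambda m: m.runtime_s)
best_energy = best_config(candidates, lambda m: m.energy_j)
best_edp = best_config(candidates, lambda m: m.edp)

for label, name, metric in [
    ("runtime", best_runtime, lambda m: m.runtime_s),
    ("energy", best_energy, lambda m: m.energy_j),
    ("EDP", best_edp, lambda m: m.edp),
]:
    gain = percent_improvement(metric(baseline), metric(candidates[name]))
    print(f"best {label}: {name} ({gain:.2f}% better than baseline)")
```

In this toy data the fastest configuration is not the one that uses the least energy, which is the kind of runtime versus power/energy tradeoff the abstract refers to. The choice of tuning objective therefore changes which configuration counts as best.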
