Accelerating machine learning (ML) workloads requires efficient methods due to their large optimization space. Autotuning has emerged as an effective approach for systematically evaluating variations of implementations. Traditionally, autotuning requires executing the workloads on the target hardware (HW). We present an interface that allows autotuning workloads to be executed on simulators instead. This approach offers high scalability when availability of the target HW is limited, as many simulations can run in parallel on any accessible HW. Additionally, we evaluate the feasibility of using fast instruction-accurate simulators for autotuning: we train various predictors to forecast the performance of ML workload implementations on the target HW from simulation statistics. Our results demonstrate that the tuned predictors are highly effective. For the tested x86, ARM, and RISC-V-based architectures, the implementation with the best actual run time on the target HW is always within the top 3% of predictions. In the best case, this approach outperforms native execution on the target HW for embedded architectures while running as few as three samples on three simulators in parallel.
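The predictor-based ranking described in the abstract can be illustrated with a minimal sketch. All details here are illustrative assumptions, not the paper's actual predictors or feature set: we assume per-candidate simulation statistics (e.g., instruction and memory-access counts), fit a simple linear regressor against measured run times, and keep the top 3% of predictions for native execution on the target HW.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical simulation statistics per candidate implementation:
# columns = instruction count, memory accesses, branches (illustrative).
n_candidates = 200
stats = rng.uniform(1.0, 100.0, size=(n_candidates, 3))

# Synthetic "measured" run times on the target HW: a noisy linear
# function of the statistics, standing in for real measurements.
true_weights = np.array([0.5, 1.2, 0.3])
runtime = stats @ true_weights + rng.normal(0.0, 1.0, n_candidates)

# Fit a simple linear predictor on a training split via least squares.
train, test = slice(0, 150), slice(150, None)
w, *_ = np.linalg.lstsq(stats[train], runtime[train], rcond=None)

# Rank held-out candidates by predicted run time and keep the top 3%
# as the shortlist to actually execute on the target HW.
pred = stats[test] @ w
k = max(1, int(np.ceil(0.03 * pred.size)))
top_k = np.argsort(pred)[:k]
print("candidates shortlisted for native runs:", top_k)
```

In the paper's setting the shortlist would then be measured natively, so only a small fraction of candidates ever needs access to the scarce target HW; the bulk of the search runs on simulators in parallel.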