Matching Linear Algebra and Tensor Code to Specialized Hardware Accelerators - 专知论文

会员服务 ·

0

线性的 · Tensor · Performer · 代码 · 易处理的 ·

2023 年 1 月 31 日

Matching Linear Algebra and Tensor Code to Specialized Hardware Accelerators

翻译：线性代数与张量代码到专用硬件加速器的匹配

Pablo Antonio Martínez,Jackson Woodruff,Jordi Armengol-Estapé,Gregorio Bernabé,José Manuel García,Michael F. P. O'Boyle

from arxiv, This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in Proceedings of the 32nd ACM SIGPLAN International Conference on Compiler Construction (CC '23), February 25-26, 2023, Montr\'eal, QC, Canada, https://doi.org/10.1145/3578360.3580262

Dedicated tensor accelerators demonstrate the importance of linear algebra in modern applications. Such accelerators have the potential for impressive performance gains, but require programmers to rewrite code using vendor APIs - a barrier to wider scale adoption. Recent work overcomes this by matching and replacing patterns within code, but such approaches are fragile and fail to cope with the diversity of real-world codes. We develop ATC, a compiler that uses program synthesis to map regions of code to specific APIs. The mapping space that ATC explores is combinatorially large, requiring the development of program classification, dynamic analysis, variable constraint generation and lexical distance matching techniques to make it tractable. We apply ATC to real-world tensor and linear algebra codes and evaluate them against four state-of-the-art approaches. We accelerate between 2.6x and 7x more programs, leading to over an order of magnitude performance improvement.

翻译：专用张量加速器突显了线性代数在现代应用中的重要性。此类加速器具有实现显著性能提升的潜力，但要求程序员使用供应商API重写代码——这成为大规模推广的障碍。近期研究通过匹配和替换代码中的模式来克服这一障碍，但这些方法较为脆弱，难以应对真实世界代码的多样性。我们开发了ATC编译器，它利用程序综合技术将代码片段映射到特定API。ATC探索的映射空间呈组合爆炸式增长，为此我们开发了程序分类、动态分析、变量约束生成以及词法距离匹配等技术，使其变得可处理。我们将ATC应用于真实的张量和线性代数代码，并与四种最先进的方法进行对比评估。ATC能够加速多出2.6倍到7倍的程序数量，从而带来超过一个数量级的性能提升。

0

相关内容

线性的

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

76+阅读 · 2022年6月28日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

哥伦比亚大学最新《机器学习》课程，Fall-B 2020 (Machine Learning)

专知会员服务

39+阅读 · 2020年11月3日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【2020新书】Python Pro专业实践原则，Practices of the Python Pro，250页pdf

【2020新书】Python Pro专业实践原则，Practices of the Python Pro，250页pdf

专知会员服务

153+阅读 · 2020年1月25日

UC.Berkeley CS189讲义教材:《机器学习全面指南》，185页pdf

专知会员服务

162+阅读 · 2020年1月16日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

斯坦福大学Fall 2018课程-机器学习硬件加速器( 附PPT下载)

斯坦福大学Fall 2018课程-机器学习硬件加速器( 附PPT下载)

专知

18+阅读 · 2018年7月15日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

Copine VII在阿尔茨海默病中的作用机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

阿司匹林乙酰化修饰ALDH1选择性清除大肠癌干细胞的分子机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

蛋白激酶CyclinE/CDK2调控去甲基化酶PHF8的机制及生物学意义

国家自然科学基金

0+阅读 · 2014年12月31日

钛表面微弧氧化HA/TiO2层上缓释型BMP-2/壳聚糖的负载及成骨机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

针刺调节术后ICCs修复再生微环境的作用及其microRNA调控

国家自然科学基金

0+阅读 · 2012年12月31日

MiR-449介导KDM4C-Notch通路在三阴性乳腺癌增殖转移中的调控研究

国家自然科学基金

0+阅读 · 2012年12月31日

β-Sarcoglycan在mSOD1介导ALS骨骼肌病变中的机制研究

国家自然科学基金

1+阅读 · 2012年12月31日

脱－γ－羧基凝血酶原(Des-γ-carboxyl prothrombin DCP)促进肝癌恶性增殖与转移作用研究

国家自然科学基金

0+阅读 · 2011年12月31日

Bose-Hubbard模型量子相变的数值研究

国家自然科学基金

0+阅读 · 2011年12月31日

基于list-mode数据的快速SART真3D PET断层重建算法的研究

国家自然科学基金

0+阅读 · 2011年12月31日

Exponential Consistency of M-estimators in Generalized Linear Mixed Models

Arxiv

0+阅读 · 2023年3月22日

$L_2$BN: Enhancing Batch Normalization by Equalizing the $L_2$ Norms of Features

Arxiv

0+阅读 · 2023年3月21日

Abstract Visual Reasoning: An Algebraic Approach for Solving Raven's Progressive Matrices

Arxiv

0+阅读 · 2023年3月21日

Enumerating Independent Linear Inferences

Arxiv

0+阅读 · 2023年3月21日

Going faster to see further: GPU-accelerated value iteration and simulation for perishable inventory control using JAX

Arxiv

0+阅读 · 2023年3月19日

Popular Critical Matchings in the Many-to-Many Setting

Arxiv

0+阅读 · 2023年3月19日

Efficient deadlock avoidance in 2D mesh NoCs that use OQ or VOQ routers

Arxiv

0+阅读 · 2023年3月19日

A Combinatorial Cut-Toggling Algorithm for Solving Laplacian Linear Systems

Arxiv

0+阅读 · 2023年3月16日

On games and simulators as a platform for development of artificial intelligence for command and control

On games and simulators as a platform for development of artificial intelligence for command and control

Arxiv

90+阅读 · 2021年10月21日

Learning to Learn and Predict: A Meta-Learning Approach for Multi-Label Classification

Learning to Learn and Predict: A Meta-Learning Approach for Multi-Label Classification

Arxiv

17+阅读 · 2019年9月9日

VIP会员

文章信息

相关主题

最新内容

《基于智能体建模与仿真的无人机蜂群模型目标定位涌现行为比较分析》360页

《基于智能体建模与仿真的无人机蜂群模型目标定位涌现行为比较分析》360页

专知会员服务

8+阅读 · 7月18日

欧洲智能弹药战略创新管理：迈向制导弹药、巡飞系统与自主无人机蜂群的技术主权研究路线图

欧洲智能弹药战略创新管理：迈向制导弹药、巡飞系统与自主无人机蜂群的技术主权研究路线图

专知会员服务

6+阅读 · 7月18日

从领域适配到部署与可解释：Berkeley博士论文解析大语言模型真实落地

从领域适配到部署与可解释：Berkeley博士论文解析大语言模型真实落地

专知会员服务

8+阅读 · 7月18日

综述 | 长程智能体研究全景：基础、演化、框架、优化与前沿

综述 | 长程智能体研究全景：基础、演化、框架、优化与前沿

专知会员服务

5+阅读 · 7月18日

DARPA拟打造十万规模自主思考作战的AI智能体集群：“受控涌现式分布式人工智能”（DICE）项目

DARPA拟打造十万规模自主思考作战的AI智能体集群：“受控涌现式分布式人工智能”（DICE）项目

专知会员服务

8+阅读 · 7月17日

《边缘端实时无线感知赋能现场多机器人部署》200页

《边缘端实时无线感知赋能现场多机器人部署》200页

专知会员服务

8+阅读 · 7月17日

战力倍增器：自主武器系统与乌克兰及加沙冲突

战力倍增器：自主武器系统与乌克兰及加沙冲突

专知会员服务

4+阅读 · 7月17日

人工智能赋能战场情报：提速决策进程

人工智能赋能战场情报：提速决策进程

专知会员服务

2+阅读 · 7月17日

《拥抱新兴技术：面向未来军官的教育革新》

《拥抱新兴技术：面向未来军官的教育革新》

专知会员服务

6+阅读 · 7月17日

ACM MM 2026 | MAR-GRPO：稳定混合图像生成的强化学习训练

ACM MM 2026 | MAR-GRPO：稳定混合图像生成的强化学习训练

专知会员服务

4+阅读 · 7月17日

综述 | 大模型水印理论与部署：来源追踪、攻击鲁棒与可信治理

综述 | 大模型水印理论与部署：来源追踪、攻击鲁棒与可信治理

专知会员服务

5+阅读 · 7月17日

《火线上的后勤保障：对抗环境下的随机规划模型研究——俄乌场景案例分析》99页

《火线上的后勤保障：对抗环境下的随机规划模型研究——俄乌场景案例分析》99页

专知会员服务

13+阅读 · 7月16日

《无人地面战车（UGV）的崛起》报告

《无人地面战车（UGV）的崛起》报告

专知会员服务

8+阅读 · 7月16日

《无人机参数化与集群飞行创新项目的监控流程管理：模型、策略及自适应解决方案》

《无人机参数化与集群飞行创新项目的监控流程管理：模型、策略及自适应解决方案》

专知会员服务

7+阅读 · 7月16日

《美军开放式任务系统（OMS）定义与文档（D&D）——Java关键抽象层（CAL）接口生成规范》47页标准

《美军开放式任务系统（OMS）定义与文档（D&D）——Java关键抽象层（CAL）接口生成规范》47页标准

专知会员服务

15+阅读 · 7月16日

相关VIP内容

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

76+阅读 · 2022年6月28日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

哥伦比亚大学最新《机器学习》课程，Fall-B 2020 (Machine Learning)

专知会员服务

39+阅读 · 2020年11月3日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【2020新书】Python Pro专业实践原则，Practices of the Python Pro，250页pdf

【2020新书】Python Pro专业实践原则，Practices of the Python Pro，250页pdf

专知会员服务

153+阅读 · 2020年1月25日

UC.Berkeley CS189讲义教材:《机器学习全面指南》，185页pdf

专知会员服务

162+阅读 · 2020年1月16日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

欧洲智能弹药战略创新管理：迈向制导弹药、巡飞系统与自主无人机蜂群的技术主权研究路线图

综述 | 长程智能体研究全景：基础、演化、框架、优化与前沿

《基于智能体建模与仿真的无人机蜂群模型目标定位涌现行为比较分析》360页

从领域适配到部署与可解释：Berkeley博士论文解析大语言模型真实落地

相关资讯

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

斯坦福大学Fall 2018课程-机器学习硬件加速器( 附PPT下载)

斯坦福大学Fall 2018课程-机器学习硬件加速器( 附PPT下载)

专知

18+阅读 · 2018年7月15日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

相关论文

Exponential Consistency of M-estimators in Generalized Linear Mixed Models

Arxiv

0+阅读 · 2023年3月22日

$L_2$BN: Enhancing Batch Normalization by Equalizing the $L_2$ Norms of Features

Arxiv

0+阅读 · 2023年3月21日

Abstract Visual Reasoning: An Algebraic Approach for Solving Raven's Progressive Matrices

Arxiv

0+阅读 · 2023年3月21日

Enumerating Independent Linear Inferences

Arxiv

0+阅读 · 2023年3月21日

Going faster to see further: GPU-accelerated value iteration and simulation for perishable inventory control using JAX

Arxiv

0+阅读 · 2023年3月19日

Popular Critical Matchings in the Many-to-Many Setting

Arxiv

0+阅读 · 2023年3月19日

Efficient deadlock avoidance in 2D mesh NoCs that use OQ or VOQ routers

Arxiv

0+阅读 · 2023年3月19日

A Combinatorial Cut-Toggling Algorithm for Solving Laplacian Linear Systems

Arxiv

0+阅读 · 2023年3月16日

On games and simulators as a platform for development of artificial intelligence for command and control

On games and simulators as a platform for development of artificial intelligence for command and control

Arxiv

90+阅读 · 2021年10月21日

Learning to Learn and Predict: A Meta-Learning Approach for Multi-Label Classification

Learning to Learn and Predict: A Meta-Learning Approach for Multi-Label Classification

Arxiv

17+阅读 · 2019年9月9日

相关基金

Copine VII在阿尔茨海默病中的作用机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

阿司匹林乙酰化修饰ALDH1选择性清除大肠癌干细胞的分子机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

蛋白激酶CyclinE/CDK2调控去甲基化酶PHF8的机制及生物学意义

国家自然科学基金

0+阅读 · 2014年12月31日

钛表面微弧氧化HA/TiO2层上缓释型BMP-2/壳聚糖的负载及成骨机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

针刺调节术后ICCs修复再生微环境的作用及其microRNA调控

国家自然科学基金

0+阅读 · 2012年12月31日

MiR-449介导KDM4C-Notch通路在三阴性乳腺癌增殖转移中的调控研究

国家自然科学基金

0+阅读 · 2012年12月31日

β-Sarcoglycan在mSOD1介导ALS骨骼肌病变中的机制研究

国家自然科学基金

1+阅读 · 2012年12月31日

脱－γ－羧基凝血酶原(Des-γ-carboxyl prothrombin DCP)促进肝癌恶性增殖与转移作用研究

国家自然科学基金

0+阅读 · 2011年12月31日

Bose-Hubbard模型量子相变的数值研究

国家自然科学基金

0+阅读 · 2011年12月31日

基于list-mode数据的快速SART真3D PET断层重建算法的研究

国家自然科学基金

0+阅读 · 2011年12月31日

微信扫码咨询专知VIP会员