Hyperbolic Diffusion in Flux Reconstruction: Optimisation through Kernel Fusion within Tensor-Product Elements


15 November 2021



Will Trojak, Rob Watson, Freddie Witherden

This initial study presents novel methods for fusing GPU kernels in the artificial compressibility method (ACM), using flux reconstruction on tensor-product elements with constant Jacobians. Fusion is made possible by hyperbolising the diffusion terms, which eliminates the expensive algorithmic steps needed to form the viscous stresses. Two fusion approaches are presented that offer differing levels of parallelism; this is found to be necessary to accommodate the change in workload as the order of accuracy of the elements is increased. Several further optimisations of these approaches are demonstrated, including a generation-time memory manager that maximises resource usage. The fused kernels achieve speedups of 3-4 times, which compares favourably with a theoretical maximum speedup of 4. In three-dimensional test cases, the generated fused kernels are found to reduce total runtime by ${\sim}25\%$, and, when compared to the standard ACM formulation, simulations demonstrate that a speedup of $2.3$ times can be achieved.
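
To make "hyperbolisation of the diffusion terms" concrete, the following is a minimal sketch on a scalar model problem, not the formulation used by the authors. It assumes a one-dimensional diffusion equation advanced in pseudo-time $\tau$, in the style of the widely used first-order relaxation approach; the symbols $q$ (an auxiliary gradient variable), $T_r$ (a relaxation time) and $\nu$ (a constant diffusivity) are illustrative assumptions and do not appear in the abstract.

```latex
% Model problem (assumed for illustration): scalar diffusion in pseudo-time
\frac{\partial u}{\partial \tau} = \nu \frac{\partial^2 u}{\partial x^2}

% Hyperbolised form: introduce q as an independent unknown that relaxes to u_x
\frac{\partial u}{\partial \tau} = \nu \frac{\partial q}{\partial x},
\qquad
\frac{\partial q}{\partial \tau} = \frac{1}{T_r}\left(\frac{\partial u}{\partial x} - q\right)

% Written as  U_\tau + A U_x = S  with  U = (u, q)^T:
A = \begin{pmatrix} 0 & -\nu \\ -1/T_r & 0 \end{pmatrix},
\qquad
S = \begin{pmatrix} 0 \\ -q/T_r \end{pmatrix}

% Eigenvalues of A are real and distinct, so the system is hyperbolic:
\det(A - \lambda I) = \lambda^2 - \frac{\nu}{T_r} = 0
\quad\Longrightarrow\quad
\lambda = \pm\sqrt{\frac{\nu}{T_r}}

% At pseudo-steady state (\partial_\tau = 0):  q = u_x  and  \nu\, q_x = \nu\, u_{xx} = 0
```

In this sketch the gradient is carried as a solution variable rather than computed in separate passes, so only first derivatives of the unknowns appear. That is consistent with the abstract's remark that the steps needed to form the viscous stresses are eliminated, although the actual system solved in the paper may differ in detail.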

