Accelerating Range Minimum Queries with Ray Tracing Cores

June 5, 2023


Enzo Meneses, Cristóbal A. Navarro, Héctor Ferrada, Felipe A. Quezada

From arXiv, 17 figures

During the last decade, GPU technology has shifted from pure general-purpose computation to the inclusion of application-specific integrated circuits (ASICs), such as Tensor Cores and Ray Tracing (RT) cores. Although these special-purpose GPU cores were designed to accelerate specific fields such as AI and real-time rendering, recent research has managed to exploit them to accelerate other tasks that typically relied on regular GPU computing. In this work we present RTXRMQ, a new approach that computes range minimum queries (RMQs) with RT cores. The main contribution is a geometric solution for RMQ, where elements become triangles that are placed and shaped according to the element's value and position in the array, respectively, such that the closest hit of a ray launched from a point given by the query parameters corresponds to the result of that query. Experimental results show that RTXRMQ is currently best suited for small query ranges relative to the problem size, achieving speedups of up to $5\times$ and $2.3\times$ over state-of-the-art CPU (HRMQ) and GPU (LCA) approaches, respectively. Although RTXRMQ is currently surpassed by LCA for medium and large query ranges, it remains competitive, being $2.5\times$ and $4\times$ faster than HRMQ, which is a highly parallel CPU approach. Furthermore, performance scaling experiments across the latest RTX GPU architectures show that if the current RT scaling trend continues, RTXRMQ's performance would scale at a higher rate than that of HRMQ and LCA, making the approach even more relevant for future high-performance applications that employ batches of RMQs.
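The geometric idea in the abstract can be illustrated with a small CPU-side sketch. The abstract does not give the exact triangle layout, so the construction below is an assumption chosen for illustration, not the authors' method: each element `i` becomes a right triangle in the `(l, r)` query plane with its right-angle corner at `(i, i)`, lying at a depth equal to the element's value. A ray launched from the query point `(l, r)` along the depth axis then meets exactly the triangles with `l <= i <= r`, and the closest hit is the minimum. The names `Triangle`, `build_scene`, `closest_hit`, and `rmq` are hypothetical, and the linear scan stands in for the hardware-accelerated traversal that RT cores would perform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triangle:
    index: int     # array position encoded by this triangle
    depth: float   # distance along the ray axis (the element's value)
    corner: int    # right-angle vertex sits at (corner, corner) in the (l, r) plane
    reach: int     # leg length, long enough to cover every valid query


def build_scene(values):
    """Turn each array element into one triangle (assumed layout, see lead-in)."""
    reach = 2 * len(values)
    return [Triangle(i, float(v), i, reach) for i, v in enumerate(values)]


def ray_hits(tri, l, r):
    """2D point-in-triangle test for the query point (l, r).

    The triangle has vertices (c, c), (c - reach, c), (c, c + reach),
    so it covers every point with l <= c <= r and r - l <= reach.
    """
    left = tri.corner - l    # how far the point lies to the left of the corner
    above = r - tri.corner   # how far the point lies above the corner
    return left >= 0 and above >= 0 and left + above <= tri.reach


def closest_hit(scene, l, r):
    """Brute-force stand-in for hardware ray traversal: nearest triangle hit."""
    best = None
    for tri in scene:
        if ray_hits(tri, l, r) and (best is None or tri.depth < best.depth):
            best = tri
    return best


def rmq(scene, l, r):
    """Index of the minimum in values[l..r] (inclusive), leftmost on ties."""
    if not 0 <= l <= r < len(scene):
        raise ValueError("query must satisfy 0 <= l <= r < n")
    return closest_hit(scene, l, r).index
```

For example, with `values = [5, 2, 7, 2, 9, 1, 4]` and `scene = build_scene(values)`, the call `rmq(scene, 2, 4)` returns index `3`, the position of the smallest value among `7, 2, 9`. The sketch costs linear time per query; any speedup in practice would come from replacing the loop in `closest_hit` with a hardware-accelerated closest-hit search.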

