This paper introduces cuVegas, a CUDA-based implementation of the Vegas Enhanced Algorithm (VEGAS+), optimized for multi-dimensional integration on GPUs. VEGAS+ is an adaptive Monte Carlo integration algorithm, recognized for its effectiveness on complex, high-dimensional integrands. It combines two variance reduction techniques, adaptive importance sampling and a variant of adaptive stratified sampling, which make it particularly well suited to integrands with multiple peaks or with other significant structures aligned with the diagonals of the integration volume. As a Monte Carlo method, VEGAS+ is well suited to parallelization and to GPU execution. Our implementation, cuVegas, aims to harness the inherent parallelism of GPUs while addressing the workload-distribution problem that often hampers efficiency in standard implementations. We present a comprehensive comparison of cuVegas with existing CPU and GPU implementations, demonstrating speedups of two to three orders of magnitude over CPU implementations and of a factor of two or more over the best existing GPU implementation. We also demonstrate the speedup on the class of integrands for which VEGAS+ was designed, namely those with multiple peaks or other significant structures aligned with diagonals of the integration volume.
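To make the idea of adaptive importance sampling concrete, the following is a minimal one-dimensional, CPU-only sketch in Python. It is not the cuVegas implementation and not the full VEGAS+ algorithm: it omits adaptive stratified sampling, replaces the usual VEGAS damping formula with a plain power law, and all names (`refine_grid`, `vegas_1d`, `gaussian_peak`) and parameter values are hypothetical choices made for illustration.

```python
import math
import random


def refine_grid(edges, d):
    """Move bin edges so that each new bin holds an equal share of the weights d.

    The weight d[i] is treated as spread uniformly across old bin i.
    """
    n = len(d)
    target = sum(d) / n
    new_edges = [edges[0]]
    acc = 0.0  # weight of the old bins that have been fully consumed
    i = 0      # index of the old bin currently being split
    for k in range(1, n):
        goal = k * target
        while i < n - 1 and acc + d[i] < goal:
            acc += d[i]
            i += 1
        frac = min(max((goal - acc) / d[i], 0.0), 1.0)
        new_edges.append(edges[i] + frac * (edges[i + 1] - edges[i]))
    new_edges.append(edges[-1])
    return new_edges


def vegas_1d(f, n_bins=50, n_iter=10, n_samples=2000, alpha=0.5, seed=0):
    """Estimate the integral of f over [0, 1] with an adaptive grid.

    Returns one (estimate, standard_error) pair per iteration.
    """
    rng = random.Random(seed)
    edges = [i / n_bins for i in range(n_bins + 1)]
    results = []
    for _it in range(n_iter):
        d = [0.0] * n_bins
        total = 0.0
        total_sq = 0.0
        for _s in range(n_samples):
            i = rng.randrange(n_bins)            # every bin is equally likely
            width = edges[i + 1] - edges[i]
            x = edges[i] + rng.random() * width  # uniform inside the bin
            w = f(x) * n_bins * width            # f(x) / p(x)
            total += w
            total_sq += w * w
            d[i] += w * w
        mean = total / n_samples
        var = (total_sq / n_samples - mean * mean) / (n_samples - 1)
        results.append((mean, math.sqrt(max(var, 0.0))))
        # Simplified damping (an assumption, not the standard VEGAS formula):
        # normalize, raise to the power alpha, and keep every weight positive.
        peak = max(d) or 1.0
        d = [(di / peak) ** alpha + 1e-6 for di in d]
        edges = refine_grid(edges, d)
    return results


def gaussian_peak(x, sigma=0.01):
    """Narrow normalized Gaussian centred at 0.5; its integral over [0, 1] is ~1."""
    return math.exp(-0.5 * ((x - 0.5) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


if __name__ == "__main__":
    for k, (est, err) in enumerate(vegas_1d(gaussian_peak)):
        print(f"iteration {k}: {est:.4f} +/- {err:.4f}")
```

Each sample inside an iteration is drawn and evaluated independently of the others, which is the property that makes this kind of method a natural fit for GPU execution. Only the accumulation of the per-bin weights and the grid refinement between iterations require coordination.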