A dynamic memory assignment strategy for dilation-based ICP algorithm on embedded GPUs

This paper proposes a memory-efficient optimization strategy for the high-performance point cloud registration algorithm VANICP, enabling lightweight execution on embedded GPUs with constrained hardware resources. VANICP is a recently published acceleration framework that significantly improves the computational efficiency of point-cloud-based applications. By using a dilation-based information propagation mechanism to turn the global nearest neighbor search (NNS) into a localized process, VANICP greatly reduces the computational complexity of the NNS. However, its original implementation demands a considerable amount of memory, which restricts its deployment in resource-constrained environments such as embedded systems. To address this issue, we propose a GPU-oriented dynamic memory assignment strategy that optimizes the memory usage of the dilation operation. Based on this strategy, we construct an enhanced version of the VANICP framework that reduces memory consumption by over 97% while preserving the original performance. The source code is available at: https://github.com/changqiong/VANICP4Em.git.
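To make the idea of dilation-based propagation concrete, the following is a minimal CPU sketch in Python. It is not the VANICP implementation and does not reproduce the paper's memory assignment strategy. It only illustrates, under assumptions, how a nearest neighbor query can become a local voxel lookup once point indices have been spread outward from occupied voxels. The function names (`build_dilated_grid`, `lookup`), the 26-neighbor dilation, and the dictionary-based grid are all hypothetical choices made for this illustration, and the result is an approximate neighbor rather than an exact one.

```python
from itertools import product


def build_dilated_grid(points, voxel_size, grid_dim, iterations):
    """Seed occupied voxels with a point index, then dilate the labels outward.

    Hypothetical illustration only: returns a dict mapping voxel coordinates
    to the index of an (approximately) nearest point, plus the voxel mapper.
    """
    def voxel_of(p):
        # Map a 3D point to integer voxel coordinates, clamped to the grid.
        return tuple(min(grid_dim - 1, max(0, int(c // voxel_size))) for c in p)

    def center(v):
        return tuple((i + 0.5) * voxel_size for i in v)

    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Seeding: each occupied voxel keeps the point closest to its center.
    grid = {}
    for idx, p in enumerate(points):
        v = voxel_of(p)
        if v not in grid or dist2(p, center(v)) < dist2(points[grid[v]], center(v)):
            grid[v] = idx

    # Dilation: each pass copies labels into empty voxels adjacent to labeled ones.
    offsets = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]
    for _ in range(iterations):
        updates = {}
        for v, idx in grid.items():
            for o in offsets:
                n = tuple(a + b for a, b in zip(v, o))
                if n in grid or any(c < 0 or c >= grid_dim for c in n):
                    continue
                best = updates.get(n)
                # If several labels compete for one voxel, keep the closer point.
                if best is None or dist2(points[idx], center(n)) < dist2(points[best], center(n)):
                    updates[n] = idx
        if not updates:
            break
        grid.update(updates)
    return grid, voxel_of


def lookup(grid, voxel_of, query):
    """Local query: read the label of the query's voxel (None if unlabeled)."""
    return grid.get(voxel_of(query))


# Usage: two target points in a 4x4x4 grid of unit voxels.
targets = [(0.5, 0.5, 0.5), (3.5, 3.5, 3.5)]
grid, voxel_of = build_dilated_grid(targets, voxel_size=1.0, grid_dim=4, iterations=3)
print(lookup(grid, voxel_of, (0.6, 1.4, 0.2)))  # 0
print(lookup(grid, voxel_of, (3.2, 2.9, 3.1)))  # 1
```

In this sketch each query costs a single dictionary lookup instead of a scan over all target points. The price is the storage needed for the labeled voxels, which is the kind of memory cost the abstract identifies as a limitation of the original implementation.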

