The key to effective point cloud compression is a robust context model consistent with complex 3D data structures. Recently, the advancement of large language models (LLMs) has highlighted their capabilities not only as powerful generators for in-context learning and generation but also as effective compressors. These dual attributes make LLMs particularly well suited to the demands of data compression. This paper therefore explores the potential of LLMs for compression tasks, focusing on experiments in lossless point cloud geometry compression (PCGC). However, applying LLMs directly to PCGC presents significant challenges: an LLM does not inherently understand the structure of a point cloud, and bridging the gap between text and point clouds through textual description is difficult, especially for large, complex point clouds and small, shapeless ones. To address these problems, we introduce a novel architecture, the Large Language Model-based Point Cloud Geometry Compression (LLM-PCGC) method, which uses an LLM to compress point cloud geometry information without any text description or alignment operation. By applying several adaptation techniques for cross-modality representation alignment and semantic consistency, including clustering, K-tree, token mapping invariance, and Low-Rank Adaptation (LoRA), the proposed method converts an LLM into a compressor/generator for point clouds. To the best of our knowledge, this is the first architecture to employ an LLM as a compressor for point cloud data. Experiments demonstrate that LLM-PCGC significantly outperforms existing methods, achieving a -40.213% bit rate reduction compared to the reference software of the MPEG Geometry-based Point Cloud Compression (G-PCC) standard, and a -2.267% bit rate reduction compared to the state-of-the-art learning-based method.