Deep learning is increasingly being used to perform machine vision tasks such as classification, object detection, and segmentation on 3D point cloud data. However, deep learning inference is computationally expensive. The limited computational capabilities of end devices thus necessitate a codec for transmitting point cloud data over the network for server-side processing. Such a codec must be lightweight and capable of achieving high compression ratios without sacrificing accuracy. Motivated by this, we present a novel point cloud codec that is highly specialized for the machine task of classification. Our codec, based on PointNet, achieves a significantly better rate-accuracy trade-off in comparison to alternative methods. In particular, it achieves a 94% reduction in BD-bitrate over non-specialized codecs on the ModelNet40 dataset. For low-resource end devices, we also propose two lightweight configurations of our encoder that achieve similar BD-bitrate reductions of 93% and 92% with 3% and 5% drops in top-1 accuracy, while consuming only 0.470 and 0.048 encoder-side kMACs/point, respectively. Our codec demonstrates the potential of specialized codecs for machine analysis of point clouds, and provides a basis for extension to more complex tasks and datasets in the future.
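To make the encoder-side complexity unit concrete, the following is a minimal sketch of how kMACs/point could be counted for a PointNet-style shared (pointwise) MLP. It assumes that each layer is a dense map applied independently to every point, that bias additions and activations are not counted, and that the global max pooling contributes no multiply-accumulates. The function name `shared_mlp_kmacs_per_point` and the layer widths are hypothetical, chosen only for illustration; they are not the configurations reported above.

```python
from typing import Sequence


def shared_mlp_kmacs_per_point(widths: Sequence[int]) -> float:
    """Return kMACs per point for a shared MLP with the given layer widths.

    widths[0] is the input dimension (3 for xyz coordinates); each later
    entry is the output width of one pointwise dense layer. A layer mapping
    c_in channels to c_out channels costs c_in * c_out MACs per point.
    Biases and activations are ignored (an assumption of this sketch).
    """
    macs = sum(c_in * c_out for c_in, c_out in zip(widths, widths[1:]))
    return macs / 1000.0


if __name__ == "__main__":
    # Hypothetical layer widths, for illustration only.
    for widths in ([3, 64, 64, 128], [3, 8, 8, 16]):
        kmacs = shared_mlp_kmacs_per_point(widths)
        print(f"widths={widths}: {kmacs:.3f} kMACs/point")
```

Because the MLP is shared across points, the per-point cost under these assumptions does not depend on how many points the cloud contains, which is why the figure is quoted per point rather than per cloud.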