Masked autoencoders have demonstrated their effectiveness in self-supervised point cloud learning. Considering that masking is a kind of corruption, in this work we explore a more general denoising autoencoder for point cloud learning (Point-DAE) by investigating more types of corruption beyond masking. Specifically, we degrade the point cloud with certain corruptions, feed the degraded version as input, and learn an encoder-decoder model to reconstruct the original point cloud from it. Three corruption families (\ie, density/masking, noise, and affine transformation) and a total of fourteen corruption types are investigated with traditional non-Transformer encoders. Besides the popular masking corruption, we identify another effective corruption family, \ie, affine transformation. The affine transformation disturbs all points globally, which is complementary to the masking corruption, where some local regions are dropped. We also validate the effectiveness of the affine transformation corruption with Transformer backbones, where we decompose the reconstruction of the complete point cloud into the reconstruction of detailed local patches and the reconstruction of the rough global shape, alleviating the position leakage problem in the reconstruction. Extensive experiments on object classification, few-shot learning, robustness testing, part segmentation, and 3D object detection validate the effectiveness of the proposed method. The code is available at \url{https://github.com/YBZh/Point-DAE}.
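To make the two complementary corruption families concrete, the following is a minimal sketch, not the released implementation. It assumes a point cloud stored as a list of $(x, y, z)$ tuples. The function names, the parameter ranges, the restriction of the affine transformation to a z-axis rotation with uniform scaling and translation, and the use of the Chamfer distance as the reconstruction target are all illustrative assumptions that the abstract does not specify.

```python
import math
import random


def mask_local_region(points, ratio, rng):
    """Masking corruption (hypothetical helper): drop the `ratio` fraction
    of points that lie nearest to a randomly chosen seed point."""
    n_drop = int(len(points) * ratio)
    seed = rng.choice(points)
    ordered = sorted(points, key=lambda p: math.dist(p, seed))
    return ordered[n_drop:]  # keep only the points far from the seed


def affine_corrupt(points, rng, max_angle=math.pi,
                   scale_range=(0.8, 1.2), max_shift=0.1):
    """Affine corruption (hypothetical helper): rotate about the z-axis,
    scale uniformly, and translate every point. Ranges are assumptions."""
    theta = rng.uniform(-max_angle, max_angle)
    scale = rng.uniform(*scale_range)
    shift = [rng.uniform(-max_shift, max_shift) for _ in range(3)]
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    return [
        (scale * (cos_t * x - sin_t * y) + shift[0],
         scale * (sin_t * x + cos_t * y) + shift[1],
         scale * z + shift[2])
        for x, y, z in points
    ]


def chamfer_distance(a, b):
    """Symmetric Chamfer distance: mean squared nearest-neighbour distance
    from a to b plus the same from b to a (brute force, O(|a||b|))."""
    def one_way(src, dst):
        return sum(min(math.dist(p, q) ** 2 for q in dst) for p in src) / len(src)
    return one_way(a, b) + one_way(b, a)


# Usage: corrupt a toy cloud, then measure how far it is from the original.
rng = random.Random(0)
cloud = [(rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(64)]
corrupted = affine_corrupt(mask_local_region(cloud, 0.25, rng), rng)
gap = chamfer_distance(corrupted, cloud)  # a decoder would be trained to shrink this
```

In this sketch the masking step removes one contiguous neighbourhood while the affine step moves every remaining point, which mirrors the local versus global distinction drawn above. An encoder-decoder trained on such pairs would take `corrupted` as input and be penalised by the distance between its output and `cloud`.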