Masked autoencoders have demonstrated their effectiveness in self-supervised point cloud learning. Since masking is one kind of corruption, in this work we explore a more general denoising autoencoder for point cloud learning (Point-DAE) by investigating corruption types beyond masking. Specifically, we degrade the point cloud with a given corruption and use the result as input, then train an encoder-decoder model to reconstruct the original point cloud from its corrupted version. We investigate three corruption families (\ie, density/masking, noise, and affine transformation), fourteen corruption types in total, with traditional non-Transformer encoders. Besides the popular masking corruption, we identify another effective corruption family, \ie, affine transformation. Affine transformation disturbs all points globally, which makes it complementary to masking, where some local regions are dropped. We also validate the effectiveness of affine transformation corruption with Transformer backbones, where we decompose the reconstruction of the complete point cloud into the reconstruction of detailed local patches and of the rough global shape, alleviating the position leakage problem in reconstruction. Extensive experiments on object classification, few-shot learning, robustness testing, part segmentation, and 3D object detection validate the effectiveness of the proposed method. The code is available at \url{https://github.com/YBZh/Point-DAE}.
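As an illustration of the corrupt-then-reconstruct setup described above, the following minimal NumPy sketch applies a global affine corruption followed by a local masking corruption, and measures how far the result is from the clean cloud. It is not the released implementation: the function names, the parameter ranges (`max_perturb`, `max_shift`, `drop_ratio`), and the choice of Chamfer distance as the reconstruction objective are all assumptions made for illustration.

```python
import numpy as np

def random_affine(points, rng, max_perturb=0.5, max_shift=0.1):
    """Globally disturb every point with x -> A x + t.
    The perturbation ranges are hypothetical, not taken from the paper."""
    A = np.eye(3) + rng.uniform(-max_perturb, max_perturb, size=(3, 3))
    t = rng.uniform(-max_shift, max_shift, size=3)
    return points @ A.T + t

def mask_local_region(points, rng, drop_ratio=0.3):
    """Drop the points closest to a randomly chosen seed point,
    removing one contiguous local region."""
    n = points.shape[0]
    seed = points[rng.integers(n)]
    dist = np.linalg.norm(points - seed, axis=1)
    n_drop = int(n * drop_ratio)
    keep = np.argsort(dist)[n_drop:]  # keep the points farthest from the seed
    return points[keep]

def chamfer_distance(a, b):
    """Symmetric Chamfer distance (squared Euclidean) between two point sets.
    Assumed here as a stand-in reconstruction loss."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()

rng = np.random.default_rng(0)
clean = rng.normal(size=(256, 3))           # stand-in for a real point cloud
corrupted = mask_local_region(random_affine(clean, rng), rng)

# An encoder-decoder would take `corrupted` as input and be trained so that
# chamfer_distance(decoder_output, clean) is small.
baseline_gap = chamfer_distance(corrupted, clean)
```

Here `baseline_gap` is the distance a model would face if it simply copied its input, so a trained encoder-decoder should produce an output with a smaller distance to `clean`.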