Self-supervised methods have proven effective for learning deep representations of 3D point cloud data. Recent methods in this domain often rely on random masking of inputs, but this strategy leaves room for improvement. We introduce PointCAM, a novel adversarial method for learning a masking function for point clouds. Our model uses a self-distillation framework with an online tokenizer for 3D point clouds. Unlike previous techniques that optimize patch-level and object-level objectives, we propose applying an auxiliary network that learns how to select masks instead of choosing them randomly. Our results show that the learned masking function achieves state-of-the-art or competitive performance on various downstream tasks. The source code is available at https://github.com/szacho/pointcam.