The recent trend in deep learning methods for 3D point cloud understanding is to propose increasingly sophisticated architectures, which either better capture 3D geometries or introduce possibly undesired inductive biases. Moreover, prior works introducing novel architectures typically compare performance within the same domain, devoting less attention to generalization to other domains. We argue that the ability of a model to transfer learnt knowledge to different domains is an important property that should be evaluated to assess the quality of a deep network architecture exhaustively. In this work we propose PatchMixer, a simple yet effective architecture that extends the ideas behind the recent MLP-Mixer paper to 3D point clouds. The novelties of our approach are the processing of local patches instead of the whole shape, which promotes robustness to partial point clouds, and the aggregation of patch-wise features using an MLP, a simpler alternative to the graph convolutions or attention mechanisms used in prior works. We evaluate our method on the shape classification and part segmentation tasks, achieving superior generalization performance compared to a selection of the most relevant deep architectures.
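To make the two ideas in the abstract concrete (local patches, and MLP-based aggregation of patch-wise features), the following is a minimal NumPy sketch of a forward pass for shape classification. It is not the authors' implementation. The abstract does not specify how patches are built or how layers are sized, so the following are all assumptions made for illustration: random patch centres with k-nearest-neighbour grouping, a shared per-point MLP with max-pooling, a single Mixer-style block without layer normalization, and every name and dimension (`group_patches`, `patch_mixer_forward`, the hidden widths).

```python
import numpy as np


def init_mlp(rng, d_in, d_hidden, d_out):
    """Random weights for a two-layer MLP (illustrative initialisation)."""
    return (rng.normal(0.0, 0.1, (d_in, d_hidden)), np.zeros(d_hidden),
            rng.normal(0.0, 0.1, (d_hidden, d_out)), np.zeros(d_out))


def mlp(x, w1, b1, w2, b2):
    """Two-layer MLP with ReLU, applied along the last axis of x."""
    return np.maximum(x @ w1 + b1, 0.0) @ w2 + b2


def group_patches(points, num_patches, patch_size, rng):
    """Assumed grouping: random centres, each with its k nearest points."""
    centre_idx = rng.choice(len(points), size=num_patches, replace=False)
    centres = points[centre_idx]                                  # (P, 3)
    # Squared distance from every centre to every point: (P, N)
    d2 = ((centres[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
    nn_idx = np.argsort(d2, axis=1)[:, :patch_size]               # (P, K)
    # Express each patch in coordinates relative to its own centre.
    return points[nn_idx] - centres[:, None, :]                   # (P, K, 3)


def patch_mixer_forward(points, params, num_patches, patch_size, rng):
    patches = group_patches(points, num_patches, patch_size, rng)
    # Patch-wise features: shared per-point MLP, then max-pool over the patch.
    tokens = mlp(patches, *params["encoder"]).max(axis=1)         # (P, C)
    # Mix information across patches with an MLP over the patch axis.
    tokens = tokens + mlp(tokens.T, *params["patch_mix"]).T       # (P, C)
    # Mix information across channels within each patch token.
    tokens = tokens + mlp(tokens, *params["channel_mix"])         # (P, C)
    # Average the patch tokens and map to class logits.
    return mlp(tokens.mean(axis=0), *params["head"])              # (classes,)


# Hypothetical sizes, chosen only to keep the example small.
P, K, C, NUM_CLASSES = 16, 32, 64, 10
rng = np.random.default_rng(0)
params = {
    "encoder": init_mlp(rng, 3, 32, C),
    "patch_mix": init_mlp(rng, P, 32, P),
    "channel_mix": init_mlp(rng, C, 128, C),
    "head": init_mlp(rng, C, 64, NUM_CLASSES),
}
cloud = rng.normal(size=(256, 3))  # stand-in for a real point cloud
logits = patch_mixer_forward(cloud, params, P, K, rng)
print(logits.shape)
```

In this sketch the patch-mixing MLP plays the role that graph convolutions or attention play in other architectures: it is the only place where features from different patches interact. Because its weights are tied to a fixed number of patches `P`, the number of patches must stay constant between training and inference under these assumptions.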