In many practical applications, 3D point cloud analysis requires rotation invariance. In this paper, we present a learnable descriptor that is invariant under 3D rotations and reflections, i.e., the action of O(3), built on the recently introduced steerable 3D spherical neurons and vector neurons. Specifically, we propose an embedding of the 3D spherical neurons into 4D vector neurons, which leverages end-to-end training of the model. In our approach, we perform TetraTransform, an equivariant embedding of the 3D input into 4D constructed from the steerable neurons, and extract deeper O(3)-equivariant features using vector neurons. The resulting model, termed TetraSphere, integrates the TetraTransform into the VN-DGCNN framework and increases the number of parameters by less than 0.0002%. TetraSphere sets a new state of the art in classifying randomly rotated real-world object scans from the challenging subsets of ScanObjectNN. Additionally, TetraSphere outperforms all equivariant methods on randomly rotated synthetic data, both in classifying objects from ModelNet40 and in segmenting parts of ShapeNet shapes. Thus, our results reveal the practical value of steerable 3D spherical neurons for learning in 3D Euclidean space.
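To make the O(3) terminology concrete, the following is a minimal sketch of the general equivariance-to-invariance pattern, not the TetraTransform or the VN-DGCNN architecture. It assumes a single vector-neuron-style linear layer that mixes channels only, followed by a Gram matrix as the invariant readout. All names (`gram_descriptor`, `random_orthogonal`, the random weights) are hypothetical and chosen for illustration.

```python
import random

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*a)]


def max_abs_diff(a, b):
    """Largest absolute entry-wise difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def det3(m):
    """Determinant of a 3x3 matrix (+1 for rotations, -1 for reflections)."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def householder(v):
    """Reflection across the plane orthogonal to v: I - 2 v v^T / (v^T v)."""
    norm_sq = sum(x * x for x in v)
    return [[IDENTITY[i][j] - 2.0 * v[i] * v[j] / norm_sq for j in range(3)]
            for i in range(3)]


def random_orthogonal(rng, n_reflections):
    """Product of n random reflections; its determinant is (-1)**n."""
    q = IDENTITY
    for _ in range(n_reflections):
        q = matmul(q, householder([rng.gauss(0.0, 1.0) for _ in range(3)]))
    return q


def gram_descriptor(points, weights):
    """Illustrative invariant descriptor (an assumption, not the paper's).

    points:  N x 3 point cloud, one point per row.
    weights: C x N channel-mixing matrix (hypothetical learned weights).
    The features F = W X are equivariant (X Q^T maps to F Q^T), so the
    Gram matrix F F^T is unchanged for any orthogonal Q.
    """
    features = matmul(weights, points)             # C x 3
    return matmul(features, transpose(features))   # C x C


rng = random.Random(0)
points = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(6)]
weights = [[rng.gauss(0.0, 1.0) for _ in range(6)] for _ in range(4)]
reference = gram_descriptor(points, weights)

for n_reflections in (1, 2):  # odd count: reflection, even count: rotation
    q = random_orthogonal(rng, n_reflections)
    moved = matmul(points, transpose(q))  # each point p becomes q p
    change = max_abs_diff(reference, gram_descriptor(moved, weights))
    print(f"det = {det3(q):+.3f}, max descriptor change = {change:.2e}")
```

The descriptor change should be at the level of floating-point round-off for both the reflection and the rotation. That is the sense in which a descriptor is invariant under O(3) rather than only under rotations.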