We tackle the data scarcity challenge in few-shot point cloud recognition of 3D objects by using a joint prediction from a conventional 3D model and a well-trained 2D model. Surprisingly, such an ensemble, though seemingly trivial, has hardly been shown to be effective in recent 2D-3D models. We find that the crux is ineffective training on the "joint hard samples", for which the 2D and 3D models make high-confidence predictions on different wrong labels, implying that the two models do not collaborate well. To this end, our proposed invariant training strategy, called InvJoint, not only places more training emphasis on the hard samples, but also seeks invariance between the conflicting, ambiguous 2D and 3D predictions. InvJoint can thus learn more collaborative 2D and 3D representations for a better ensemble. Extensive experiments on 3D shape classification with the widely adopted ModelNet10/40, ScanObjectNN, and Toys4K datasets, and on shape retrieval with ShapeNet-Core, validate the superiority of InvJoint.
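To make the two notions in the abstract concrete, the following is a minimal sketch of a joint 2D-3D prediction and of one possible way to flag joint hard samples. It is not the InvJoint training procedure, which the abstract does not specify. The function names, the equal-weight averaging of class probabilities, and the confidence `threshold` are all assumptions made for illustration.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the class axis.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def joint_prediction(logits_2d, logits_3d):
    # Assumed ensemble rule: average the two models' class probabilities
    # with equal weight and take the most probable class.
    probs = 0.5 * (softmax(logits_2d) + softmax(logits_3d))
    return probs.argmax(axis=1)

def joint_hard_mask(logits_2d, logits_3d, labels, threshold=0.5):
    # Assumed criterion: a sample is "joint hard" when both models are
    # wrong, both are confident (top probability >= threshold, a
    # hypothetical value), and they disagree on which wrong label to pick.
    p2, p3 = softmax(logits_2d), softmax(logits_3d)
    pred2, pred3 = p2.argmax(axis=1), p3.argmax(axis=1)
    both_wrong = (pred2 != labels) & (pred3 != labels)
    both_confident = (p2.max(axis=1) >= threshold) & (p3.max(axis=1) >= threshold)
    return both_wrong & both_confident & (pred2 != pred3)
```

Under these assumptions, a sample with true label 0 for which the 2D model confidently predicts class 1 and the 3D model confidently predicts class 2 is flagged as joint hard. Averaging the two predictions cannot recover the correct label in that case, which is the failure mode the abstract attributes to poorly collaborating models.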