Despite the remarkable success of deep learning, an optimal convolution operation on point clouds remains elusive owing to their irregular data structure. Existing methods mainly focus on designing an effective continuous kernel function that can handle an arbitrary point in continuous space. Although various high-performing approaches have been proposed, we observe that the standard pointwise feature is represented by 1D channels and can become more informative when its representation involves additional spatial feature dimensions. In this paper, we present Multidimensional Kernel Convolution (MKConv), a novel convolution operator that learns to transform the point feature representation from a vector to a multidimensional matrix. Unlike standard point convolution, MKConv proceeds in two steps. (i) It first activates the spatial dimensions of the local feature representation by exploiting multidimensional kernel weights. These spatially expanded features can represent their embedded information through spatial correlation as well as channel correlation in feature space, carrying more detailed local structure information. (ii) Discrete convolutions are then applied to the multidimensional features, which can be regarded as a grid-structured matrix. In this way, we can apply discrete convolutions to point cloud data without voxelization, which suffers from information loss. Furthermore, we propose a spatial attention module, Multidimensional Local Attention (MLA), which provides comprehensive structure awareness within the local point set by reweighting the spatial feature dimensions. We demonstrate that MKConv is well suited to point cloud processing tasks, including object classification, object part segmentation, and scene semantic segmentation, and achieves superior results.
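The two steps described in the abstract can be illustrated with a small pure-Python sketch. It is not the authors' implementation: the abstract does not specify the kernel function, the grid size, or the convolution layout, so everything concrete here is an assumption. In particular, `kernel_weights` is a fixed softmax over distances to grid-cell centers that stands in for the learned multidimensional kernel, the 2 x 2 x 2 grid is arbitrary, and `discrete_conv_full` is the simplest possible discrete convolution (a kernel that spans the whole grid, no padding). The MLA module is not modeled.

```python
import math
from itertools import product


def kernel_weights(rel_pos, centers, sharpness=4.0):
    """Toy stand-in for the learned multidimensional kernel (assumption).

    Returns one weight per grid cell: a softmax over the negative squared
    distance from the neighbor offset to each cell center.
    """
    logits = [-sharpness * sum((p - c) ** 2 for p, c in zip(rel_pos, center))
              for center in centers]
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expand_features(rel_positions, features, centers):
    """Step (i): expand N neighbor vectors (C channels) into a K x C grid.

    grid[k][c] = sum over neighbors j of w_j[k] * f_j[c]
    """
    num_channels = len(features[0])
    grid = [[0.0] * num_channels for _ in centers]
    for rel_pos, feat in zip(rel_positions, features):
        weights = kernel_weights(rel_pos, centers)
        for k, w in enumerate(weights):
            for c, value in enumerate(feat):
                grid[k][c] += w * value
    return grid


def discrete_conv_full(grid, conv_weights):
    """Step (ii): discrete convolution whose kernel covers the whole grid.

    Each output channel is a weighted sum over all cells and input channels.
    """
    return [sum(w_out[k][c] * grid[k][c]
                for k in range(len(grid))
                for c in range(len(grid[0])))
            for w_out in conv_weights]


# Assumed layout: a 2 x 2 x 2 grid of cell centers around the query point.
centers = list(product((-0.5, 0.5), repeat=3))

# Three neighbors: offsets relative to the query point, 2-channel features.
rel_positions = [(0.4, 0.5, 0.5), (-0.5, -0.4, -0.5), (0.0, 0.0, 0.0)]
features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

grid = expand_features(rel_positions, features, centers)

# Hypothetical convolution weights: one output channel, all weights 1.
conv_weights = [[[1.0, 1.0] for _ in centers]]
output = discrete_conv_full(grid, conv_weights)
```

In this sketch the grid feature keeps the two input channels but spreads them over eight spatial cells, so a neighbor's contribution lands mostly in the cell closest to its offset. Once the features sit on a regular grid, an ordinary discrete convolution can be applied to them, with no voxelization of the raw points.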