With the advent of group equivariant convolutions in the deep learning literature, spherical CNNs with $\mathsf{SO}(3)$-equivariant layers have been developed to handle data sampled from signals on the sphere $S^2$. One can implicitly obtain $\mathsf{SO}(3)$-equivariant convolutions on $S^2$, with significant efficiency gains, by explicitly requiring gauge equivariance w.r.t. $\mathsf{SO}(2)$. In this paper, we build on this fact by introducing a higher order generalization of the gauge equivariant convolution, whose implementation is dubbed a gauge equivariant Volterra network (GEVNet). This allows us to model spatially extended nonlinear interactions within a given receptive field while still maintaining equivariance to global isometries. We prove theoretical results regarding the equivariance and construction of higher order gauge equivariant convolutions. We then empirically demonstrate the parameter efficiency of our model, first on computer vision benchmark data (e.g., spherical MNIST), and then in combination with a convolutional kernel network (CKN) on neuroimaging data. In the neuroimaging experiments, the resulting two-part architecture (CKN + GEVNet) is used to automatically discriminate between patients with Lewy Body Disease (DLB), Alzheimer's Disease (AD) and Parkinson's Disease (PD) from diffusion magnetic resonance images (dMRI). The GEVNet extracts micro-architectural features within each voxel, while the CKN extracts macro-architectural features across voxels. This compound architecture is well positioned to exploit both the intra- and inter-voxel information contained in the dMRI data, and it improves on the classification results obtained from either component alone.
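For intuition about what "higher order" means here, the following is a minimal sketch of an ordinary second-order Volterra convolution on a flat 1D signal. It is an assumption-laden illustration only: it is not the gauge equivariant construction on $S^2$ developed in this paper, and the function name `volterra_conv1d` and the kernels `w1`, `w2` are hypothetical. The linear term is a standard sliding-window correlation; the quadratic term weights products of pairs of samples inside the same receptive field, which is the kind of spatially extended nonlinear interaction a Volterra layer models.

```python
import numpy as np


def volterra_conv1d(f, w1, w2):
    """Second-order Volterra convolution of a 1D signal (illustrative sketch).

    f  : signal of length n
    w1 : first-order kernel of length k (ordinary linear filter)
    w2 : second-order kernel of shape (k, k), weighting pairwise products
    Returns an array of length n - k + 1 (no padding).
    """
    f = np.asarray(f, dtype=float)
    w1 = np.asarray(w1, dtype=float)
    w2 = np.asarray(w2, dtype=float)
    k = w1.shape[0]
    if w2.shape != (k, k):
        raise ValueError("w2 must have shape (k, k) with k = len(w1)")

    n_out = f.shape[0] - k + 1
    out = np.empty(n_out)
    for x in range(n_out):
        patch = f[x:x + k]              # receptive field at position x
        linear = w1 @ patch             # sum_i w1[i] * f[x + i]
        quadratic = patch @ w2 @ patch  # sum_{i,j} w2[i, j] * f[x + i] * f[x + j]
        out[x] = linear + quadratic
    return out
```

Setting `w2` to zero recovers a plain linear filter, so in this simplified setting the second-order term can be read as a learned correction on top of a standard convolution layer.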