Many datasets in scientific and engineering applications consist of objects that have specific geometric structure. A common example is data that inhabits a representation of the group SO$(3)$ of 3D rotations: scalars, vectors, tensors, \textit{etc}. One way for a neural network to exploit prior knowledge of this structure is to enforce SO$(3)$-equivariance throughout its layers, and several such architectures have been proposed. While general methods for handling arbitrary SO$(3)$ representations exist, they are computationally intensive and complicated to implement. We show that, by judicious symmetry breaking, we can efficiently increase the expressiveness of a network operating only on vector and order-2 tensor representations of SO$(2)$. We demonstrate the method on an important problem from High Energy Physics known as \textit{b-tagging}, where particle jets originating from b-meson decays must be discriminated from an overwhelming QCD background. In this task, we find that augmenting a standard architecture with our method results in a \ensuremath{2.3\times} improvement in rejection score.