Employing equivariance in neural networks leads to greater parameter efficiency and improved generalization performance by encoding domain knowledge in the architecture; however, the majority of existing approaches require an a priori specification of the desired symmetries. We present a neural network architecture, Linear Group Networks (LGNs), for learning linear groups acting on the weight space of neural networks. Linear groups are desirable due to their inherent interpretability, as they can be represented as finite matrices. LGNs learn groups without any supervision or knowledge of the hidden symmetries in the data, and the learned groups can be mapped to well-known operations in machine learning. We use LGNs to learn groups on multiple datasets while considering different downstream tasks; we demonstrate that the linear group structure depends on both the data distribution and the considered task.