Canonical Polyadic (CP) tensor decomposition is a workhorse algorithm for discovering underlying low-dimensional structure in tensor data. Conventional CP decomposition does this by fitting a low-rank tensor to the data under the least-squares loss. Generalized CP (GCP) decompositions extend this approach to general loss functions that can be more appropriate, e.g., for modeling binary or count data or for improving robustness to outliers. However, GCP decompositions do not explicitly account for symmetry in the tensors, which commonly arises in modern applications. For example, a tensor formed by stacking the adjacency matrices of a dynamic graph over time will naturally exhibit symmetry along the two modes corresponding to the graph nodes. In this paper, we develop a symmetric GCP (SymGCP) decomposition that allows for general forms of symmetry, i.e., symmetry along any subset of the modes. SymGCP accounts for symmetry by enforcing the corresponding symmetry in the decomposition. We derive gradients for SymGCP that enable its efficient computation via all-at-once optimization with existing tensor kernels. The form of the gradients also admits various stochastic approximations, from which we develop stochastic SymGCP algorithms that scale to large tensors. We demonstrate the utility of the proposed SymGCP algorithms with a variety of experiments on both synthetic and real data.
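To make the idea of enforcing symmetry in the decomposition concrete, the following is a minimal sketch for the dynamic-graph example: a three-way tensor of stacked binary adjacency matrices, symmetric in its first two modes, modeled with one shared factor matrix for both node modes. Everything in it is an illustrative assumption rather than the method of the paper: the function names `model`, `loss_and_grads`, and `fit` are hypothetical, the Bernoulli-logit loss is just one possible choice of general loss, the gradients are obtained by applying the chain rule directly to this tied model, and plain gradient descent with backtracking stands in for whichever all-at-once optimizer is actually used.

```python
import numpy as np

def model(A, C):
    """Rank-R model tensor M[i,j,k] = sum_r A[i,r] A[j,r] C[k,r].

    Using the same factor A for modes 1 and 2 makes M symmetric in (i, j).
    """
    return np.einsum("ir,jr,kr->ijk", A, A, C)

def loss_and_grads(X, A, C):
    """Bernoulli-logit loss f(x, m) = log(1 + exp(m)) - x*m, summed over all
    entries, with its gradients w.r.t. A and C (chain rule on the tied model)."""
    M = model(A, C)
    loss = np.sum(np.logaddexp(0.0, M) - X * M)
    # Elementwise derivative df/dm = sigmoid(m) - x (tanh form is overflow-safe).
    Y = 0.5 * (1.0 + np.tanh(0.5 * M)) - X
    # A appears in modes 1 and 2, so its gradient is the sum of two
    # MTTKRP-like contractions of Y, one per tied mode.
    G_A = (np.einsum("ijk,jr,kr->ir", Y, A, C)
           + np.einsum("ijk,ir,kr->jr", Y, A, C))
    G_C = np.einsum("ijk,ir,jr->kr", Y, A, A)
    return loss, G_A, G_C

def fit(X, rank, iters=200, seed=0):
    """All-at-once gradient descent with Armijo backtracking (a simple
    stand-in optimizer chosen for this sketch)."""
    rng = np.random.default_rng(seed)
    n, _, T = X.shape
    A = 0.5 * rng.standard_normal((n, rank))
    C = 0.5 * rng.standard_normal((T, rank))
    history = []
    for _ in range(iters):
        loss, G_A, G_C = loss_and_grads(X, A, C)
        history.append(loss)
        g2 = np.sum(G_A ** 2) + np.sum(G_C ** 2)
        step = 1.0
        for _ in range(30):  # halve the step until sufficient decrease
            A_new, C_new = A - step * G_A, C - step * G_C
            if loss_and_grads(X, A_new, C_new)[0] <= loss - 1e-4 * step * g2:
                A, C = A_new, C_new
                break
            step *= 0.5
    history.append(loss_and_grads(X, A, C)[0])
    return A, C, history

# Synthetic "dynamic graph": T symmetric binary adjacency matrices on n nodes.
rng = np.random.default_rng(0)
n, T, R = 8, 5, 2
A_true = rng.standard_normal((n, R))
C_true = rng.standard_normal((T, R))
P = 0.5 * (1.0 + np.tanh(0.5 * model(A_true, C_true)))  # edge probabilities
X = (rng.random((n, n, T)) < P).astype(float)
iu = np.triu_indices(n, k=1)
X[iu[1], iu[0], :] = X[iu[0], iu[1], :]  # mirror upper triangle to lower

A_hat, C_hat, history = fit(X, rank=R)
print("initial loss:", history[0], "final loss:", history[-1])
```

Tying the two node modes to a single factor matrix halves the number of node parameters and guarantees a symmetric fit by construction. A stochastic variant could, under the same assumptions, evaluate `Y` only on a sampled subset of entries before forming the contractions.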