The symmetric generalized eigenvalue problem (SGEP) is a fundamental concept in numerical linear algebra. It captures the solution of many classical machine learning problems, such as canonical correlation analysis, independent component analysis, partial least squares, linear discriminant analysis, and principal component analysis, among others. Despite this, most general solvers are prohibitively expensive when dealing with streaming data sets (i.e., minibatches), and research has instead concentrated on finding efficient solutions to specific problem instances. In this work, we develop a game-theoretic formulation of the top-$k$ SGEP whose Nash equilibrium is the set of generalized eigenvectors. We also present a parallelizable algorithm with guaranteed asymptotic convergence to the Nash equilibrium. Current state-of-the-art methods require $O(d^2k)$ runtime complexity per iteration, which is prohibitively expensive when the number of dimensions ($d$) is large. We show how to modify this parallel approach to achieve $O(dk)$ runtime complexity. Empirically, we demonstrate that the resulting algorithm is able to solve a variety of SGEP instances, including a large-scale analysis of neural network activations.
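For readers unfamiliar with the problem, the SGEP in its standard form asks for pairs $(\lambda, v)$ satisfying $Av = \lambda Bv$, with $A$ symmetric and $B$ symmetric positive definite. The sketch below is a dense reference solver for the top-$k$ pairs using a Cholesky reduction to an ordinary symmetric eigenproblem. It is not the game-theoretic algorithm described in this abstract; it costs $O(d^3)$ and needs the full matrices in memory, which is the kind of cost that motivates streaming methods. The function name `top_k_sgep` and the random test matrices are illustrative assumptions.

```python
import numpy as np


def top_k_sgep(A, B, k):
    """Top-k generalized eigenpairs of A v = lambda B v (dense reference).

    Assumes A is symmetric and B is symmetric positive definite.
    Hypothetical helper for illustration, not the paper's method.
    """
    # Factor B = L L^T, then form C = L^{-1} A L^{-T}.
    L = np.linalg.cholesky(B)
    Y = np.linalg.solve(L, A)        # L^{-1} A
    C = np.linalg.solve(L, Y.T)      # L^{-1} A L^{-T}, using A = A^T
    C = 0.5 * (C + C.T)              # remove round-off asymmetry

    # Ordinary symmetric eigenproblem C w = lambda w (ascending order).
    eigvals, W = np.linalg.eigh(C)
    idx = np.argsort(eigvals)[::-1][:k]

    # Map back: v = L^{-T} w, which gives V^T B V = I.
    V = np.linalg.solve(L.T, W[:, idx])
    return eigvals[idx], V


# Small synthetic instance (assumed data, for illustration only).
rng = np.random.default_rng(0)
d, k = 6, 2
M = rng.standard_normal((d, d))
A = 0.5 * (M + M.T)                  # symmetric
N = rng.standard_normal((d, d))
B = N @ N.T + d * np.eye(d)          # symmetric positive definite

lam, V = top_k_sgep(A, B, k)
residual = np.linalg.norm(A @ V - B @ V @ np.diag(lam))
```

The returned eigenvectors are $B$-orthonormal, so `V.T @ B @ V` is the $k \times k$ identity up to round-off, and `residual` should be close to zero. Setting `B` to the identity recovers an ordinary symmetric eigenproblem, the form in which principal component analysis is usually written.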