Large matrices arise in many machine learning and data analysis applications, including as representations of datasets, graphs, model weights, and first- and second-order derivatives. Randomized Numerical Linear Algebra (RandNLA) is an area that uses randomness to develop improved algorithms for ubiquitous matrix problems. The area has reached a certain level of maturity, but recent hardware trends, efforts to incorporate RandNLA algorithms into core numerical libraries, and advances in machine learning, statistics, and random matrix theory have led to new theoretical and practical challenges. This article provides a self-contained overview of RandNLA in light of these developments.