Randomized Numerical Linear Algebra: A Perspective on the Field With an Eye to Software

Riley Murray, James Demmel, Michael W. Mahoney, N. Benjamin Erichson, Maksim Melnichenko, Osman Asif Malik, Laura Grigori, Piotr Luszczek, Michał Dereziński, Miles E. Lopes, Tianyu Liang, Hengrui Luo, Jack Dongarra

Comments from arXiv: v1: This is the first arXiv release of LAPACK Working Note 299. v2: Complete rewrite of the subsection on trace estimation, among other changes. See frontmatter page ii (PDF page 5) for the revision history.

Randomized numerical linear algebra (RandNLA, for short) concerns the use of randomization as a resource to develop improved algorithms for large-scale linear algebra computations. The origins of contemporary RandNLA lie in theoretical computer science, where it blossomed from a simple idea: randomization provides an avenue for computing approximate solutions to linear algebra problems more efficiently than deterministic algorithms. This idea proved fruitful in the development of scalable algorithms for machine learning and statistical data analysis applications. However, RandNLA's true potential only came into focus upon integration with the fields of numerical analysis and "classical" numerical linear algebra. Through the efforts of many individuals, randomized algorithms have been developed that provide full control over the accuracy of their solutions and that can be every bit as reliable as algorithms that might be found in libraries such as LAPACK. Recent years have even seen the incorporation of certain RandNLA methods into MATLAB, the NAG Library, NVIDIA's cuSOLVER, and scikit-learn. For all its success, we believe that RandNLA has yet to realize its full potential. In particular, we believe the scientific community stands to benefit significantly from suitably defined "RandBLAS" and "RandLAPACK" libraries, to serve as standards conceptually analogous to BLAS and LAPACK. This 200-page monograph represents a step toward defining such standards. In it, we cover topics spanning basic sketching, least squares and optimization, low-rank approximation, full matrix decompositions, leverage score sampling, and sketching data with tensor product structures (among others). Much of the provided pseudo-code has been tested via publicly available MATLAB and Python implementations.
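To make the idea of sketching for least squares concrete, here is a minimal NumPy illustration of the generic "sketch-and-solve" approach. It is not taken from the monograph and does not use any RandBLAS or RandLAPACK interface; the function name `sketch_and_solve_lstsq`, the dense Gaussian sketching operator, and the problem sizes are all assumptions chosen for readability. A dense Gaussian sketch does not by itself save time, so this only shows the structure of the computation.

```python
import numpy as np


def sketch_and_solve_lstsq(A, b, sketch_rows, rng):
    """Approximately solve min_x ||A x - b||_2 via a smaller sketched problem.

    Hypothetical helper for illustration only (not an API from any library).
    """
    m, _ = A.shape
    # Dense Gaussian sketching operator S of shape (sketch_rows, m),
    # scaled so that E[S^T S] is the identity.
    S = rng.standard_normal((sketch_rows, m)) / np.sqrt(sketch_rows)
    # Solve the reduced problem min_x ||(S A) x - (S b)||_2.
    x_sketch, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
    return x_sketch


rng = np.random.default_rng(0)
m, n = 2000, 20  # tall, overdetermined system (sizes are arbitrary)
A = rng.standard_normal((m, n))
b = A @ rng.standard_normal(n) + 0.1 * rng.standard_normal(m)

# Reference solution from a deterministic solver.
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)
# Sketched solution using 200 rows instead of 2000.
x_approx = sketch_and_solve_lstsq(A, b, sketch_rows=200, rng=rng)

res_exact = np.linalg.norm(A @ x_exact - b)
res_approx = np.linalg.norm(A @ x_approx - b)
print(f"residual (exact):    {res_exact:.4f}")
print(f"residual (sketched): {res_approx:.4f}")
```

The sketched residual can never be smaller than the exact one, and for a sketch size that is a modest multiple of the number of columns it is typically only slightly larger. Practical implementations would usually replace the dense Gaussian operator with a sparse or structured one so that applying the sketch is cheap.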

