This note provides a lightweight tutorial on using Eigen, a C++ template library for linear algebra, to implement statistical and machine learning algorithms. The emphasis is practical rather than methodological: we show how common matrix operations, decomposition-based solvers, and vectorized updates can be written in readable C++ and then connected to Python through pybind11. Two examples are used throughout the tutorial: kernel ridge regression and matrix factorization with stochastic gradient descent. The examples are intentionally small enough to be studied as code, but they contain many operations that appear in larger research software projects, including kernel matrix construction, regularized linear system solving, row-wise updates, and NumPy--Eigen data conversion. The goal is to provide a reproducible starting point for researchers who want to move from mathematical formulas to efficient C++ implementations while retaining a convenient Python workflow.
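For orientation, the two running examples can be summarized by their standard textbook formulas. The notation below is assumed for illustration and is not taken from the note itself: $K$ is the $n \times n$ kernel matrix with entries $K_{ij} = k(x_i, x_j)$, $y$ is the response vector, $\lambda > 0$ is the regularization parameter, $u_i$ and $v_j$ are rows of the two factor matrices, $r_{ij}$ is an observed entry, and $\eta$ is the step size (with constant factors absorbed into it).

\[
\hat{\alpha} = (K + \lambda I_n)^{-1} y,
\qquad
\hat{f}(x) = \sum_{i=1}^{n} \hat{\alpha}_i \, k(x_i, x)
\]

\[
e_{ij} = r_{ij} - u_i^\top v_j,
\qquad
u_i \leftarrow u_i + \eta \, (e_{ij} v_j - \lambda u_i),
\qquad
v_j \leftarrow v_j + \eta \, (e_{ij} u_i - \lambda v_j)
\]

In the second line both updates are understood to use the values of $u_i$ and $v_j$ from before the step. Because $K + \lambda I_n$ is symmetric positive definite for $\lambda > 0$, the first system is a natural fit for a Cholesky-type solver such as Eigen's `LLT` or `LDLT`, while the second pair of updates touches one row of each factor matrix at a time, which is the row-wise update pattern mentioned above.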