We investigate random matrices whose entries are obtained by applying a nonlinear kernel function to pairwise inner products between $n$ independent data vectors, drawn uniformly from the unit sphere in $\mathbb{R}^d$. This study is motivated by applications in machine learning and statistics, where these kernel random matrices and their spectral properties play significant roles. We establish the weak limit of the empirical spectral distribution of these matrices in a polynomial scaling regime, where $d, n \to \infty$ such that $n / d^\ell \to \kappa$, for some fixed $\ell \in \mathbb{N}$ and $\kappa \in (0, \infty)$. Our findings generalize an earlier result by Cheng and Singer, who examined the same model in the linear scaling regime (with $\ell = 1$). Our work reveals an equivalence principle: the spectrum of the random kernel matrix is asymptotically equivalent to that of a simpler matrix model, constructed as a linear combination of a (shifted) Wishart matrix and an independent matrix sampled from the Gaussian orthogonal ensemble. The aspect ratio of the Wishart matrix and the coefficients of the linear combination are determined by $\ell$ and the expansion of the kernel function in the orthogonal Hermite polynomial basis. Consequently, the limiting spectrum of the random kernel matrix can be characterized as the free additive convolution between a Marchenko-Pastur law and a semicircle law. We also extend our results to cases with data vectors sampled from isotropic Gaussian distributions instead of spherical distributions.
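To make the model concrete, the following is a minimal numerical sketch of how such a kernel matrix and its empirical spectrum could be formed. The abstract does not state the normalization, so the scaling used here is an assumption: off-diagonal entries $f(\sqrt{d}\,\langle x_i, x_j\rangle)/\sqrt{n}$ and a zero diagonal. The helper names (`sample_sphere`, `hermite_kernel`, `kernel_matrix`) and the sizes `n`, `d` are illustrative choices only, and the kernel is specified through hypothetical coefficients in the normalized probabilists' Hermite basis.

```python
import math

import numpy as np
from numpy.polynomial.hermite_e import hermeval


def sample_sphere(n, d, rng):
    """Draw n independent points uniformly from the unit sphere in R^d."""
    g = rng.standard_normal((n, d))
    return g / np.linalg.norm(g, axis=1, keepdims=True)


def hermite_kernel(coeffs):
    """Return f(x) = sum_k coeffs[k] * He_k(x) / sqrt(k!).

    He_k is the probabilists' Hermite polynomial; dividing by sqrt(k!)
    makes the basis orthonormal under the standard Gaussian measure.
    """
    scaled = [c / math.sqrt(math.factorial(k)) for k, c in enumerate(coeffs)]
    return lambda x: hermeval(x, scaled)


def kernel_matrix(X, f):
    """Apply f entrywise to the rescaled pairwise inner products.

    Assumed normalization (not given in the abstract):
    A_ij = f(sqrt(d) * <x_i, x_j>) / sqrt(n) for i != j, and A_ii = 0.
    """
    n, d = X.shape
    gram = np.sqrt(d) * (X @ X.T)
    A = f(gram) / np.sqrt(n)
    A = (A + A.T) / 2.0  # guard against floating-point asymmetry
    np.fill_diagonal(A, 0.0)
    return A


rng = np.random.default_rng(0)
n, d = 200, 100                    # linear regime: ell = 1, n / d = 2
X = sample_sphere(n, d, rng)
f = hermite_kernel([0.0, 1.0])     # f(x) = He_1(x) = x
A = kernel_matrix(X, f)
eigs = np.linalg.eigvalsh(A)       # empirical spectrum of the kernel matrix
```

A histogram of `eigs` can then be compared visually with a candidate limiting density. Changing the coefficient list passed to `hermite_kernel` switches to other polynomial kernels under the same assumed scaling.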