We study the problem of learning multi-index models (MIMs), where the label depends on the input $\boldsymbol{x} \in \mathbb{R}^d$ only through an unknown $\mathsf{s}$-dimensional projection $\boldsymbol{W}_*^\mathsf{T} \boldsymbol{x} \in \mathbb{R}^\mathsf{s}$. Exploiting the equivariance of this problem under the orthogonal group $\mathcal{O}_d$, we obtain a sharp harmonic-analytic characterization of the learning complexity of MIMs with spherically symmetric inputs, which refines and generalizes previous Gaussian-specific analyses. Specifically, we derive statistical and computational complexity lower bounds within the Statistical Query (SQ) and Low-Degree Polynomial (LDP) frameworks. These bounds decompose naturally across spherical harmonic subspaces. Guided by this decomposition, we construct a family of spectral algorithms based on harmonic tensor unfolding that sequentially recover the latent directions and (nearly) achieve these SQ and LDP lower bounds. Depending on the choice of harmonic degree sequence, these estimators realize a broad range of trade-offs between sample complexity and runtime. From a technical standpoint, our results build on the semisimple decomposition of the $\mathcal{O}_d$-action on $L^2 (\mathbb{S}^{d-1})$ and the intertwining isomorphism between spherical harmonics and traceless symmetric tensors.
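To make the link between spherical harmonics and traceless symmetric tensors concrete, the following minimal sketch illustrates only the simplest case, harmonic degree 2, where the harmonic tensor $d\,\boldsymbol{x}\boldsymbol{x}^\mathsf{T} - \boldsymbol{I}$ is already a matrix and no unfolding is needed. It is not the estimator constructed in this work. The function names, the link function $y = z_1^2 - z_2^2$, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np

def sample_sphere(n, d, rng):
    """Draw n points uniformly from the unit sphere S^{d-1}."""
    g = rng.standard_normal((n, d))
    return g / np.linalg.norm(g, axis=1, keepdims=True)

def degree2_harmonic_moment(X, y):
    """Empirical mean of y * (d * x x^T - I) over unit-norm rows x of X.

    For ||x|| = 1 the matrix d * x x^T - I is symmetric and traceless,
    i.e. a degree-2 harmonic tensor evaluated at x.
    """
    n, d = X.shape
    return d * (X.T * y) @ X / n - y.mean() * np.eye(d)

def top_eigvecs(M, s):
    """Eigenvectors of symmetric M for the s largest |eigenvalues|."""
    eigvals, eigvecs = np.linalg.eigh(M)
    order = np.argsort(-np.abs(eigvals))[:s]
    return eigvecs[:, order]

# Hypothetical toy instance: d = 20, s = 2, n = 20000 (illustrative values).
rng = np.random.default_rng(0)
n, d, s = 20000, 20, 2

# Random orthonormal d x s matrix playing the role of W_*.
W, _ = np.linalg.qr(rng.standard_normal((d, s)))

X = sample_sphere(n, d, rng)
Z = np.sqrt(d) * X @ W          # rescaled projection W^T x
y = Z[:, 0] ** 2 - Z[:, 1] ** 2  # assumed link; depends on x only via W^T x

W_hat = top_eigvecs(degree2_harmonic_moment(X, y), s)

# Subspace alignment in [0, 1]; equals 1 when span(W_hat) = span(W).
alignment = np.linalg.norm(W.T @ W_hat, "fro") ** 2 / s
print(f"subspace alignment: {alignment:.3f}")
```

This works only when the link function has a non-vanishing degree-2 harmonic component, as the assumed link here does. A link whose first non-zero component sits at a higher degree would require higher-order harmonic tensors, which is the setting where unfolding becomes relevant.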