Multi-task learning is effective for related applications, but its performance can deteriorate when the target sample size is small. Transfer learning can borrow strength from related studies, yet many existing methods rely on restrictive bounded-difference assumptions between the source and target models. We propose SMART, a spectral transfer method for multi-task linear regression that instead assumes spectral similarity: the target left and right singular subspaces lie within the corresponding source subspaces and are sparsely aligned with the source singular bases. Such an assumption is natural when studies share latent structures, and it enables transfer beyond bounded-difference settings. SMART estimates the target coefficient matrix through structured regularization that incorporates spectral information from a source study. Importantly, it requires only a fitted source model rather than the raw source data, making it useful when data sharing is limited. Although the optimization problem is nonconvex, we develop a practical ADMM-based algorithm. We establish general, non-asymptotic error bounds and a minimax lower bound in the noiseless-source regime. Under additional regularity conditions, these results yield near-minimax Frobenius error rates up to logarithmic factors. Simulations confirm improved estimation accuracy and robustness to negative transfer, and an analysis of multi-modal single-cell data demonstrates better predictive performance. The Python implementation of SMART, along with the code to reproduce all experiments in this paper, is publicly available at https://github.com/boxinz17/smart.
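To make the spectral-similarity assumption concrete, the following is a minimal NumPy sketch that builds a source coefficient matrix and a target matrix whose singular subspaces lie inside the source's and use only a sparse subset of the source singular vectors. It illustrates the assumption only; it is not the SMART estimator or its ADMM algorithm, and it is not taken from the linked repository. The function names `make_spectrally_similar_pair` and `subspace_residual`, the dimensions, and the singular values are all hypothetical choices made for illustration.

```python
import numpy as np


def make_spectrally_similar_pair(p=30, q=20, r_source=5, keep=(0, 2), seed=0):
    """Build (B_source, B_target) satisfying a simple form of spectral similarity.

    All sizes and singular values here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)

    # Orthonormal source singular bases, obtained from QR of Gaussian matrices.
    U, _ = np.linalg.qr(rng.standard_normal((p, r_source)))
    V, _ = np.linalg.qr(rng.standard_normal((q, r_source)))

    # Source coefficient matrix with decaying singular values.
    d_source = np.linspace(5.0, 1.0, r_source)
    B_source = U @ np.diag(d_source) @ V.T

    # Target reuses only a sparse subset of the source singular vectors,
    # with its own (here much larger) singular values.
    keep = list(keep)
    d_target = 10.0 * rng.uniform(1.0, 2.0, size=len(keep))
    B_target = U[:, keep] @ np.diag(d_target) @ V[:, keep].T

    return B_source, B_target, U, V


def subspace_residual(B, basis):
    """Relative Frobenius norm of the part of B's columns outside span(basis)."""
    projected = basis @ (basis.T @ B)
    return np.linalg.norm(B - projected) / np.linalg.norm(B)


if __name__ == "__main__":
    B_s, B_t, U, V = make_spectrally_similar_pair()
    # Both residuals should be numerically zero: the target subspaces
    # are contained in the source subspaces by construction.
    print("left residual: ", subspace_residual(B_t, U))
    print("right residual:", subspace_residual(B_t.T, V))
    # The entrywise difference need not be small relative to the source.
    print("||B_t - B_s||_F:", np.linalg.norm(B_t - B_s))
    print("||B_s||_F:      ", np.linalg.norm(B_s))
```

In this toy construction the target and source share singular vectors but not singular values, so the Frobenius distance between the two coefficient matrices can exceed the norm of the source itself. This is the kind of setting in which a bounded-difference assumption would fail while the subspace containment still holds.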