Precision matrix estimation is essential in many fields, yet it is challenging when samples for the target study are limited. Transfer learning can improve estimation accuracy by leveraging data from related source studies. We propose Trans-Glasso, a two-step transfer learning method for precision matrix estimation. First, we obtain initial estimators using a multi-task learning objective that captures shared and unique features across studies. Then, we refine these estimators through differential network estimation to adjust for structural differences between the target and source precision matrices. Under the assumption that most entries of the target precision matrix are shared with the source matrices, we derive non-asymptotic error bounds and show that Trans-Glasso achieves minimax optimality under certain conditions. Extensive simulations demonstrate Trans-Glasso's superior performance compared to baseline methods, particularly in small-sample settings. We further validate Trans-Glasso in applications to gene networks across brain tissues and protein networks for various cancer subtypes, showcasing its effectiveness in biological contexts. Additionally, we derive the minimax optimal rate for differential network estimation, representing the first such guarantee in this area. The Python implementation of Trans-Glasso, along with code to reproduce all experiments in this paper, is publicly available at https://github.com/boxinz17/transglasso-experiments.
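The refinement step can be illustrated with a toy sketch. It rests on assumptions that the abstract does not state: that each study's initial estimate is corrected by subtracting its estimated differential network (study precision minus target precision), and that the corrected estimates are then averaged with weights proportional to sample size. The function name `combine_estimates` and all inputs are hypothetical and chosen for illustration only; the authors' actual implementation is in the repository linked above.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def combine_estimates(
    initial: Sequence[Matrix],
    differential: Sequence[Matrix],
    sample_sizes: Sequence[int],
) -> Matrix:
    """Sample-size-weighted average of (initial[k] - differential[k]).

    initial[k]      : first-step precision estimate for study k (k = 0 is the target)
    differential[k] : estimate of (precision of study k) - (target precision);
                      differential[0] should be the zero matrix
    sample_sizes[k] : number of samples in study k

    Assumption: this combination rule is an illustrative guess, not a
    statement of the method in the paper.
    """
    if not (len(initial) == len(differential) == len(sample_sizes)):
        raise ValueError("one entry per study is required in every argument")
    total = sum(sample_sizes)
    d = len(initial[0])
    combined = [[0.0] * d for _ in range(d)]
    for omega_k, psi_k, n_k in zip(initial, differential, sample_sizes):
        weight = n_k / total
        for i in range(d):
            for j in range(d):
                # Remove the estimated study-specific difference, then average.
                combined[i][j] += weight * (omega_k[i][j] - psi_k[i][j])
    return combined


# Toy inputs: a target study with few samples and one source with many.
target_init = [[2.0, 0.5], [0.5, 2.0]]
source_init = [[2.0, 0.0], [0.0, 2.0]]
zero = [[0.0, 0.0], [0.0, 0.0]]
source_diff = [[0.0, -0.5], [-0.5, 0.0]]  # source precision minus target precision

estimate = combine_estimates(
    [target_init, source_init], [zero, source_diff], [10, 90]
)
```

In this toy case the differential estimate exactly accounts for the gap between the source and the target, so the combined matrix matches the target's initial estimate up to floating-point rounding while drawing 90% of its weight from the larger source sample.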