Predicting high-dimensional transcriptional responses to genetic perturbations is challenging due to severe experimental noise and sparse gene-level effects. Existing methods often suffer from mean collapse, where high correlation is achieved by predicting global average expression rather than perturbation-specific responses, leading to many false positives and limited biological interpretability. Recent approaches incorporate biological knowledge graphs into perturbation models, but these graphs are typically treated as dense and static, which can propagate noise and obscure true perturbation signals. We propose AdaPert, a perturbation-conditioned framework that addresses mean collapse by explicitly modeling sparsity and biological structure. AdaPert learns perturbation-specific subgraphs from biological knowledge graphs and applies adaptive learning to separate true signals from noise. Across multiple genetic perturbation benchmarks, AdaPert consistently outperforms existing baselines and achieves substantial improvements on DEG-aware evaluation metrics, indicating more accurate recovery of perturbation-specific transcriptional changes.
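The mean-collapse failure mode described in the abstract can be illustrated with a small synthetic sketch. This is not AdaPert and uses none of the paper's data or metrics. The function names (`simulate`, `mean_baseline`, `evaluate`), the problem sizes, the effect distributions, and the two DEG-restricted scores are all assumptions chosen for illustration. The toy data consist of a shared control profile plus a sparse shift on a few genes per perturbation, and the baseline predicts each perturbation as the average of all the others.

```python
import numpy as np


def pearson(a, b):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0


def simulate(n_perts=50, n_genes=2000, n_deg=20, seed=0):
    """Toy data (assumed, arbitrary units): control profile + sparse shifts."""
    rng = np.random.default_rng(seed)
    control = rng.gamma(shape=2.0, scale=1.0, size=n_genes)
    truth = np.tile(control, (n_perts, 1))
    deg_mask = np.zeros((n_perts, n_genes), dtype=bool)
    for p in range(n_perts):
        # Each perturbation changes only n_deg genes (its DEGs).
        idx = rng.choice(n_genes, size=n_deg, replace=False)
        deg_mask[p, idx] = True
        signs = rng.choice([-1.0, 1.0], size=n_deg)
        truth[p, idx] += signs * rng.uniform(1.0, 2.0, size=n_deg)
    return control, truth, deg_mask


def mean_baseline(truth):
    """Leave-one-out average: predict each perturbation as the mean of the others."""
    n = truth.shape[0]
    return (truth.sum(axis=0, keepdims=True) - truth) / (n - 1)


def evaluate(control, truth, pred, deg_mask):
    """Average three scores over perturbations (the DEG scores are illustrative)."""
    raw, deg_delta, recall = [], [], []
    for p in range(truth.shape[0]):
        m = deg_mask[p]
        k = int(m.sum())
        # Correlation on absolute expression, dominated by the shared baseline.
        raw.append(pearson(pred[p], truth[p]))
        # Correlation of predicted vs. true change, restricted to the true DEGs.
        deg_delta.append(pearson(pred[p, m] - control[m], truth[p, m] - control[m]))
        # Fraction of true DEGs among the k genes with the largest predicted change.
        top = np.argsort(-np.abs(pred[p] - control))[:k]
        recall.append(float(m[top].mean()))
    return {
        "raw_pearson": float(np.mean(raw)),
        "deg_delta_pearson": float(np.mean(deg_delta)),
        "deg_recall_at_k": float(np.mean(recall)),
    }


if __name__ == "__main__":
    control, truth, deg_mask = simulate()
    scores = evaluate(control, truth, mean_baseline(truth), deg_mask)
    for name, value in scores.items():
        print(f"{name}: {value:.3f}")
```

In this toy setting the averaged prediction correlates strongly with the true expression profile because both share the control baseline, while the two DEG-restricted scores stay near chance. That gap is the reason a DEG-aware metric is more informative than raw correlation when effects are sparse.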