Single-cell foundation models learn by reconstructing masked gene expression, implicitly treating technical noise as signal. With dropout rates exceeding 90%, reconstruction objectives encourage models to encode measurement artifacts rather than stable cellular programs. We introduce Cell-JEPA, a joint-embedding predictive architecture that shifts learning from reconstructing sparse counts to prediction in latent space. The key insight is that cell identity is redundantly encoded across genes. We show that predicting cell-level embeddings from partial observations forces the model to learn dropout-robust features. On cell-type clustering, Cell-JEPA achieves 0.72 AvgBIO in zero-shot transfer versus 0.53 for scGPT, a 36% relative improvement. On perturbation prediction within a single cell line, Cell-JEPA improves absolute-state reconstruction but not effect-size estimation, suggesting that representation learning and perturbation modeling address complementary aspects of cellular prediction.
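The latent-space objective described above can be illustrated with a minimal numpy sketch: a masked view of a cell's expression vector is encoded and a predictor tries to match the embedding of the full profile produced by a slowly updated target encoder. All dimensions, linear "encoders", and the EMA coefficient here are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (hypothetical; the paper does not specify these here).
n_genes, d_latent = 200, 16

# Linear maps stand in for the context encoder, target encoder, and predictor.
W_ctx = rng.normal(scale=0.1, size=(n_genes, d_latent))  # context encoder
W_tgt = W_ctx.copy()                                     # EMA target encoder
W_pred = np.eye(d_latent)                                # predictor head

def jepa_loss(x, mask_frac=0.9):
    """Predict the full-cell embedding from a masked view, in latent space."""
    keep = rng.random(n_genes) > mask_frac   # simulate ~90% dropout
    x_masked = np.where(keep, x, 0.0)
    z_pred = (x_masked @ W_ctx) @ W_pred     # prediction from the partial view
    z_tgt = x @ W_tgt                        # target from the full profile
    return float(np.mean((z_pred - z_tgt) ** 2))

def ema_update(W_tgt, W_ctx, tau=0.996):
    # The target encoder tracks the context encoder via an exponential
    # moving average rather than receiving gradients directly.
    return tau * W_tgt + (1.0 - tau) * W_ctx

x = rng.poisson(0.5, size=n_genes).astype(float)  # toy sparse count vector
loss = jepa_loss(x)
W_tgt = ema_update(W_tgt, W_ctx)
print(round(loss, 4))
```

Because the loss is computed between embeddings rather than raw counts, zeroed-out genes never appear as reconstruction targets, which is the mechanism by which the objective avoids rewarding models for memorizing dropout patterns.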