Single-cell RNA sequencing (scRNA-seq) has transformed biology by enabling the measurement of gene expression at cellular resolution, providing information about cell types, states, and disease contexts. Recently, single-cell foundation models have emerged as powerful tools for learning transferable representations directly from expression profiles, improving performance on classification and clustering tasks. However, these models are limited to discrete prediction heads, which collapse cellular complexity into predefined labels that fail to capture the richer, contextual explanations biologists need. We introduce Cell2Text, a multimodal generative framework that translates scRNA-seq profiles into structured natural language descriptions. By integrating gene-level embeddings from single-cell foundation models with pretrained large language models, Cell2Text generates coherent summaries that capture cellular identity, tissue origin, disease associations, and pathway activity, and it generalizes to unseen cells. Empirically, Cell2Text outperforms baselines on classification accuracy, shows strong ontological consistency under PageRank-based similarity metrics, and achieves high semantic fidelity in text generation. These results indicate that coupling expression data with natural language offers both stronger predictive performance and inherently interpretable outputs, pointing to a scalable path for label-efficient characterization of unseen cells.
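The abstract names PageRank-based similarity metrics for ontological consistency but does not define them. As a rough illustration only, the sketch below shows one possible reading: run personalized PageRank on a cell-type ontology graph, restarting at the true label, and read off the score of the predicted label, so that predictions close to the truth in the ontology score higher than distant ones. The toy graph, the label names, the damping value, and the function names `personalized_pagerank` and `ontology_similarity` are all assumptions made for this example and are not taken from Cell2Text.

```python
from collections import defaultdict


def personalized_pagerank(edges, seed, damping=0.85, iters=100):
    """Personalized PageRank by power iteration on an undirected graph.

    edges: iterable of (u, v) label pairs; seed: the restart node.
    Returns a dict mapping each node to its score (scores sum to 1).
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    if seed not in neighbors:
        raise KeyError(f"seed {seed!r} is not in the graph")

    nodes = sorted(neighbors)
    rank = {n: 1.0 if n == seed else 0.0 for n in nodes}
    for _ in range(iters):
        # Restart mass goes only to the seed node.
        nxt = {n: (1.0 - damping) if n == seed else 0.0 for n in nodes}
        # Each node spreads its damped score evenly over its neighbors.
        for n in nodes:
            share = damping * rank[n] / len(neighbors[n])
            for m in neighbors[n]:
                nxt[m] += share
        rank = nxt
    return rank


def ontology_similarity(edges, true_label, predicted_label):
    """Score of the predicted label under PageRank seeded at the true label."""
    return personalized_pagerank(edges, true_label)[predicted_label]


# Hypothetical toy ontology (illustrative labels, not real ontology terms).
TOY_ONTOLOGY = [
    ("cell", "immune cell"),
    ("cell", "epithelial cell"),
    ("immune cell", "T cell"),
    ("immune cell", "B cell"),
    ("T cell", "CD4 T cell"),
    ("T cell", "CD8 T cell"),
]

if __name__ == "__main__":
    truth = "CD4 T cell"
    for guess in ("CD8 T cell", "B cell", "epithelial cell"):
        score = ontology_similarity(TOY_ONTOLOGY, truth, guess)
        print(f"{guess}: {score:.4f}")
```

Under this reading, a prediction of "CD8 T cell" for a true "CD4 T cell" receives more credit than "epithelial cell", which exact-match accuracy cannot express. The raw score is not normalized, so it is meaningful only for comparing predictions against the same true label.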