Single-cell omics technologies have transformed our understanding of cellular diversity by enabling high-resolution profiling of individual cells. However, the unprecedented scale and heterogeneity of the resulting datasets demand robust frameworks for data integration and annotation. The Cell Ontology (CL) has emerged as a pivotal resource for meeting the FAIR (Findable, Accessible, Interoperable, and Reusable) data principles: it provides standardized, species-agnostic terms for canonical cell types and forms a core component of a wide range of platforms and tools. In this paper, we describe the varied uses of CL in these platforms and tools and detail ongoing work to improve and extend CL content, including the addition of transcriptomic types, in close collaboration with major atlasing efforts such as the Human Cell Atlas and the BRAIN Initiative Cell Atlas Network. We also cover the challenges and future plans for harmonising classical and transcriptomic cell type definitions, integrating markers, and using Large Language Models (LLMs) to improve the content and efficiency of CL workflows.