As Large Language Models (LLMs) rapidly evolve, their influence in science is becoming increasingly prominent. The emerging capabilities of LLMs in task generalization and free-form dialogue can significantly advance fields such as chemistry and biology. However, single-cell biology, which studies the foundational building blocks of living organisms, still faces several challenges. High knowledge barriers and the limited scalability of current methods prevent LLMs from being fully exploited for single-cell data, impeding direct accessibility and rapid iteration. To this end, we introduce ChatCell, which marks a paradigm shift by enabling single-cell analysis through natural language. By leveraging vocabulary adaptation and unified sequence generation, ChatCell acquires deep expertise in single-cell biology and can accommodate a diverse range of analysis tasks. Extensive experiments further demonstrate ChatCell's robust performance and its potential to deepen single-cell insights, paving the way for more accessible and intuitive exploration in this pivotal field. Our project homepage is available at https://zjunlp.github.io/project/ChatCell.