Large Language Models (LLMs) are revolutionizing bioinformatics by enabling advanced analysis of DNA, RNA, protein, and single-cell data. This survey systematically reviews recent advances in four areas: genomic sequence modeling, RNA structure prediction, protein function inference, and single-cell transcriptomics. We also discuss key challenges, including data scarcity, computational complexity, and cross-omics integration, and explore future directions such as multimodal learning, hybrid AI models, and clinical applications. By offering a comprehensive perspective, this paper underscores the transformative potential of LLMs for driving innovation in bioinformatics and precision medicine.