Genome annotation is essential for understanding the functional elements within genomes. While automated methods are indispensable for processing large-scale genomic data, they often struggle to predict gene structures and functions accurately. Consequently, manual curation by domain experts remains crucial for validating and refining these predictions. The complementary strengths of automated tools and manual curation highlight the importance of integrating human expertise with AI capabilities to improve both the accuracy and efficiency of genome annotation. However, manual curation is inherently labor-intensive and time-consuming, making it difficult to scale to large datasets. To address these challenges, we propose a conceptual framework, Human-AI Collaborative Genome Annotation (HAICoGA), which leverages a synergistic partnership between humans and artificial intelligence to enhance human capabilities and accelerate the genome annotation process. Additionally, we explore the potential of integrating Large Language Models (LLMs) into this framework to support and augment specific tasks. Finally, we discuss emerging challenges and outline open research questions to guide further exploration in this area.