Comprehending genomic information is essential for biomedical research, yet extracting data from complex, distributed databases remains challenging. Large language models (LLMs) show promise for genomic question answering (QA), but they are limited by restricted access to domain-specific databases. GeneGPT, the current state-of-the-art system, augments LLMs with specialized API calls, though it is constrained by rigid API dependencies and limited adaptability. We replicate GeneGPT and propose GenomAgent, a multi-agent framework that efficiently coordinates specialized agents to answer complex genomics queries. Evaluated on nine tasks from the GeneTuring benchmark, GenomAgent outperforms GeneGPT by 12% on average, and its flexible architecture extends beyond genomics to other scientific domains that require expert knowledge extraction.