Recent advances in Large Language Models (LLMs) are transforming biology, computer science, engineering, and everyday life. However, integrating the wide array of computational tools, databases, and scientific literature remains a challenge in biological research. LLMs are well suited to unstructured integration, efficient information retrieval, and the automation of standard workflows and actions across these diverse resources. To harness these capabilities in bioinformatics, we present a prototype Bioinformatics Retrieval Augmented Digital assistant (BRAD). BRAD is a chatbot and agentic system that integrates a variety of bioinformatics tools. The Python package implements an AI \texttt{Agent} that is powered by LLMs and connects to a local file system, online databases, and a user's software. The \texttt{Agent} is highly configurable, enabling tasks such as Retrieval-Augmented Generation, searches across bioinformatics databases, and the execution of software pipelines. By coordinating these tools, BRAD delivers a context-aware, semi-autonomous system that extends beyond the capabilities of conventional LLM-based chatbots. A graphical user interface (GUI) provides intuitive access to the system.