The study of biological materials and bio-inspired materials science is well established; however, surprisingly little of this knowledge has been systematically translated into engineering solutions. To accelerate discovery and guide insights, an open-source autoregressive transformer large language model (LLM), BioinspiredLLM, is reported. The model was fine-tuned on a corpus of over a thousand peer-reviewed articles in the field of structural biological and bio-inspired materials and can be prompted to recall information, assist with research tasks, and function as an engine for creativity. The model is shown to accurately recall information about biological materials and is further strengthened with enhanced reasoning ability, as well as with retrieval-augmented generation, which incorporates new data during generation and can also help to trace back sources, update the knowledge base, and connect knowledge domains. BioinspiredLLM is also shown to develop sound hypotheses regarding biological materials design, remarkably so for materials that have never been explicitly studied before. Lastly, the model shows promise in collaborating with other generative artificial intelligence models in a workflow that can reshape the traditional materials design process. This collaborative generative artificial intelligence method can stimulate and enhance bio-inspired materials design workflows. Biological materials are at a critical intersection of multiple scientific fields, and models like BioinspiredLLM help to connect knowledge domains.