With the advent of large language models (LLMs), there is growing interest in applying LLMs to scientific tasks. In this work, we conduct an experimental study to explore the applicability of LLMs for configuring, annotating, translating, explaining, and generating scientific workflows. We use five workflow-specific experiments and evaluate several open- and closed-source language models using state-of-the-art workflow systems. Our studies reveal that LLMs often struggle with workflow-related tasks due to their lack of knowledge of scientific workflows. We further observe that the performance of LLMs varies across experiments and workflow systems. Our findings can help workflow developers and users understand LLMs' capabilities in scientific workflows, and motivate further research on applying LLMs to workflows.