With the advent of large language models (LLMs), there is growing interest in applying LLMs to scientific tasks. In this work, we conduct an experimental study to explore the applicability of LLMs for configuring, annotating, translating, explaining, and generating scientific workflows. We use five workflow-specific experiments and evaluate several open- and closed-source language models using state-of-the-art workflow systems. Our studies reveal that LLMs often struggle with workflow-related tasks due to their lack of knowledge of scientific workflows. We further observe that the performance of LLMs varies across experiments and workflow systems. Our findings can help workflow developers and users understand LLMs' capabilities in scientific workflows, and motivate further research on applying LLMs to workflows.