Large language models (LLMs) have been touted as a means to increase productivity in many areas of today's working life. Scientific research is no exception: the potential of LLM-based tools to assist scientists in their daily work has become a widely discussed topic across disciplines. However, the study of this subject is only at its very beginning, and it remains unclear how the potential of LLMs will materialise in research practice. With this study, we provide first empirical evidence on the use of LLMs in the research process. We investigated a set of use cases for LLM-based tools in scientific research and conducted a first study to assess to what degree current tools are helpful. In this paper, we report specifically on use cases related to software engineering, such as generating application code and developing scripts for data analytics. Although we studied seemingly simple use cases, results differ significantly across tools. Our results highlight the promise of LLM-based tools in general, yet we also observe various issues, particularly regarding the integrity of the output these tools provide.