Unveiling ChatGPT's Usage in Open Source Projects: A Mining-based Study

Large Language Models (LLMs) have gained significant attention in the software engineering community. Developers can now use these models through industrial-grade tools that provide a convenient interface to LLMs, such as OpenAI's ChatGPT. While the potential of LLMs to assist developers across several tasks has been documented in the literature, there is a lack of empirical evidence mapping the actual usage of LLMs in software projects. In this work, we aim to fill this gap. First, we mine 1,501 commits, pull requests (PRs), and issues from open-source projects by matching regular expressions likely to indicate the usage of ChatGPT to accomplish the task. Then, we manually analyze these instances, discarding false positives (i.e., instances in which ChatGPT was mentioned but not actually used) and categorizing the task automated in the 467 true positive instances (165 commits, 159 PRs, 143 issues). This resulted in a taxonomy of 45 tasks that developers automate via ChatGPT. The taxonomy, accompanied by representative examples, provides (i) developers with valuable insights on how to exploit LLMs in their workflow and (ii) researchers with a clear overview of tasks that, according to developers, could benefit from automated solutions.
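
The mining step described above (matching regular expressions against commits, PRs, and issues) can be illustrated with a minimal sketch. The abstract does not list the expressions actually used in the study, so the pattern, the `find_candidates` helper, and the sample texts below are hypothetical and serve only to show the idea.

```python
import re

# Hypothetical pattern: the abstract does not give the study's actual expressions.
# It matches "ChatGPT", "Chat GPT", and "Chat-GPT", ignoring case.
CHATGPT_PATTERN = re.compile(r"\bchat[\s-]?gpt\b", re.IGNORECASE)


def find_candidates(items):
    """Return the (kind, text) pairs whose text mentions ChatGPT.

    kind is "commit", "pr", or "issue". A match is only a candidate:
    a mention does not prove that ChatGPT was used for the task.
    """
    return [(kind, text) for kind, text in items if CHATGPT_PATTERN.search(text)]


# Made-up sample texts, for illustration only.
sample = [
    ("commit", "Refactor parser (generated with ChatGPT)"),
    ("issue", "Should we add a Chat-GPT plugin?"),
    ("pr", "Fix typo in README"),
]

candidates = find_candidates(sample)
for kind, text in candidates:
    print(f"{kind}: {text}")
```

In this sample the second match mentions ChatGPT without using it to perform a task, which is the kind of false positive the manual analysis is meant to discard.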
