The rapid advancement of large language models (LLMs), exemplified by OpenAI's GPT series, has significantly impacted domains such as natural language processing, software development, education, healthcare, finance, and scientific research. However, OpenAI APIs pose challenges that traditional APIs do not, including the complexity of prompt engineering, token-based cost management, non-deterministic outputs, and black-box behavior. To the best of our knowledge, no previous empirical study has explored the challenges developers encounter when using OpenAI APIs. To fill this gap, we conduct the first comprehensive empirical study, analyzing 2,874 OpenAI API-related discussions from the popular Q&A forum Stack Overflow. We first examine the popularity and difficulty of these posts. We then manually classify them into nine OpenAI API-related categories and identify the specific challenges associated with each category through topic modeling analysis. Based on our empirical findings, we finally propose actionable implications for developers, LLM vendors, and researchers.