No-Code Development Platforms (NCDPs) empower non-technical end users to build applications tailored to their specific needs without writing code. While NCDPs lower technical barriers, users still require some technical knowledge, e.g., to structure process steps or define event-action rules. Large Language Models (LLMs) offer a promising way to further reduce these requirements by supporting natural language interaction and dynamic code generation. By integrating LLMs, NCDPs can become more accessible to non-technical users, enabling application development that truly requires no technical expertise. Despite growing interest in LLM-powered NCDPs, a systematic investigation into the factors influencing LLM suitability and performance remains absent. Understanding these factors is critical to leveraging LLMs' capabilities effectively and maximizing their impact. In this paper, we investigate key factors influencing the effectiveness of LLMs in supporting end-user application development within NCDPs. Through comprehensive experiments, we evaluate the impact of four such factors, i.e., model selection, prompt language, training data background, and an error-informed few-shot setup, on the quality of generated applications. Specifically, we selected a range of LLMs based on their architecture, scale, design focus, and training data, and evaluated them across four real-world smart home automation scenarios implemented on a representative open-source LLM-powered NCDP. Our findings offer practical insights into how LLMs can be effectively integrated into NCDPs, informing both platform design and the selection of suitable LLMs for end-user application development.