In recent years, generative AI has advanced considerably and shows significant promise for augmenting human productivity. Large language models (LLMs), with ChatGPT-4 as an example, have drawn particular attention. Numerous studies have examined the impact of LLM-based tools on human productivity, either in lab settings with designed tasks or through observational studies. Field experiments that apply LLM-based tools in realistic settings, however, remain limited. This paper presents the findings of a randomized controlled field trial assessing the effectiveness of LLM-based tools in providing unmonitored support services for information retrieval.