In natural language processing, the rapid development of large language models (LLMs) has attracted growing attention. LLMs have shown a high level of creativity across a variety of tasks, but methods for assessing that creativity remain inadequate. Assessing LLM creativity requires accounting for differences from humans and measuring along multiple dimensions while balancing accuracy and efficiency. This paper aims to establish an efficient framework for assessing the level of creativity in LLMs. Adapting a modified version of the Torrance Tests of Creative Thinking, we evaluate the creative performance of various LLMs across 7 tasks using 4 criteria: Fluency, Flexibility, Originality, and Elaboration. To this end, we develop a comprehensive test dataset of 700 questions and an LLM-based evaluation method. We also present a novel analysis of LLMs' responses to diverse prompts and role-play situations. We find that LLM creativity falls short primarily in originality, while excelling in elaboration. Moreover, the choice of prompts and the model's role-play settings significantly influence creativity. The experimental results further indicate that collaboration among multiple LLMs can enhance originality. Notably, our findings reveal a consensus between human evaluations and LLMs regarding the personality traits that influence creativity. These findings underscore the significant impact of LLM design on creativity and bridge artificial intelligence and human creativity, offering insights into LLMs' creativity and its potential applications.