Quriosity: Analyzing Human Questioning Behavior and Causal Inquiry through Curiosity-Driven Queries

Recent progress in Large Language Model (LLM) technology has changed our role in interacting with these models. Instead of primarily testing these models with questions we already know the answers to, we are now using them for queries whose answers are unknown to us, driven by human curiosity. This shift highlights the growing need to understand curiosity-driven human questions: those that are more complex, open-ended, and reflective of real-world needs. To this end, we present Quriosity, a collection of 13.5K naturally occurring questions from three diverse sources: human-to-search-engine queries, human-to-human interactions, and human-to-LLM conversations. Our comprehensive collection enables a rich understanding of human curiosity across various domains and contexts. Our analysis reveals a significant presence of causal questions (up to 42%) in the dataset, for which we develop an iterative prompt improvement framework to identify all causal queries and examine their unique linguistic properties, cognitive complexity, and source distribution. Our paper paves the way for future work on causal question identification and open-ended chatbot interactions. Our code and data are at https://github.com/roberto-ceraolo/quriosity.

