The use of Large Language Models (LLMs) has increased significantly in recent years, with individuals frequently interacting with chatbots to receive answers to a wide range of questions. In an era where information is readily accessible, it is crucial to stimulate and preserve human cognitive abilities and maintain strong reasoning skills. This paper addresses these challenges by promoting the use of hints as an alternative or a supplement to direct answers. We first introduce a manually constructed hint dataset, WIKIHINT, which includes 5,000 hints created for 1,000 questions. We then fine-tune open-source LLMs such as LLaMA-3.1 for hint generation in answer-aware and answer-agnostic settings. We assess the effectiveness of the hints with human participants who try to answer questions with and without the aid of hints. Additionally, we introduce a lightweight evaluation method, HINTRANK, to evaluate and rank hints in both answer-aware and answer-agnostic settings. Our findings show that (a) the dataset helps generate more effective hints, (b) including answer information along with questions generally improves hint quality, and (c) encoder-based models perform better than decoder-based models in hint ranking.