Question answering is a core capability of large language models (LLMs). However, when individuals encounter unfamiliar knowledge in texts, they often formulate questions that the text itself cannot answer due to insufficient understanding of the underlying information. Recent studies reveal that while LLMs can detect unanswerable questions, they struggle to assist users in reformulating them; even advanced models such as GPT-3.5 demonstrate limited effectiveness in this regard. To address this limitation, we propose DRS: Deep Question Reformulation with Structured Output, a novel zero-shot method that enhances LLMs' ability to assist users in reformulating questions so that relevant information can be extracted from new documents. DRS combines the strengths of LLMs with a DFS-based algorithm that iteratively explores potential entity combinations and constrains outputs to predefined entities. This structured approach significantly enhances the reformulation capabilities of LLMs. Comprehensive experimental evaluations demonstrate that DRS improves the reformulation accuracy of GPT-3.5 from 23.03% to 70.42%, while also enhancing the performance of open-source models, such as Gemma2-9B, from 26.35% to 56.75%.
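The abstract describes DFS-based exploration of entity combinations under predefined-entity constraints. As an illustrative sketch only (not the authors' implementation), the search can be pictured as a depth-first traversal over subsets of a fixed entity set, where a validity check, here a placeholder for the actual LLM answerability judgment, decides when a combination yields an answerable reformulation. All names below (`dfs_entity_combinations`, `is_valid`, `max_size`) are hypothetical:

```python
from typing import Callable, List, Optional, Tuple

def dfs_entity_combinations(
    entities: List[str],
    is_valid: Callable[[Tuple[str, ...]], bool],
    max_size: int = 3,
) -> Optional[Tuple[str, ...]]:
    """Depth-first search over combinations drawn from predefined entities.

    `is_valid` is a stand-in for an LLM call that judges whether a question
    reformulated around the given entity combination is answerable from the
    document. Returns the first valid combination found, or None.
    """
    def dfs(start: int, chosen: Tuple[str, ...]) -> Optional[Tuple[str, ...]]:
        if chosen and is_valid(chosen):
            return chosen
        if len(chosen) == max_size:
            return None
        # Extend the current combination with each remaining entity in turn.
        for i in range(start, len(entities)):
            found = dfs(i + 1, chosen + (entities[i],))
            if found is not None:
                return found
        return None

    return dfs(0, ())

# Toy usage: pretend a reformulation is answerable iff it involves both
# "Marie Curie" and "radium" (a mock of the real LLM check).
entities = ["Marie Curie", "radium", "Paris", "Nobel Prize"]
result = dfs_entity_combinations(
    entities, lambda combo: {"Marie Curie", "radium"} <= set(combo)
)
# result == ("Marie Curie", "radium")
```

The depth-first order means small entity combinations along each branch are tested before larger ones are fully enumerated, and the `max_size` cap bounds the exponential search space, one plausible reason a structured search outperforms free-form generation.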