Answering open-ended questions remains challenging for AI systems because it requires synthesis, judgment, and exploration beyond factual retrieval, and users often refine answers through multiple iterations rather than accepting a single response. Existing QA benchmarks do not explicitly support this refinement process. To address this gap, we introduce a new task, document-grounded related insight generation, where the goal is to generate additional insights from a document collection that help improve, extend, or rethink an initial answer to an open-ended question, ultimately supporting richer user interaction and a better overall question answering experience. We curate and release SCOpE-QA (Scientific Collections for Open-Ended QA), a dataset of 3,000 open-ended questions across 20 research collections. We present InsightGen, a two-stage approach that first constructs a thematic representation of the document collection using clustering, and then selects related context based on neighborhood selection from the thematic graph to generate diverse and relevant insights using LLMs. Extensive evaluation on 3,000 questions using two generation models and two evaluation settings shows that InsightGen consistently produces useful, relevant, and actionable insights, establishing a strong baseline for this new task.
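The two-stage idea described above (cluster the collection into themes, then pick context from the neighborhood of the question's theme in a thematic graph) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy 2-D embeddings, the cosine-based k-means-style clustering, the similarity threshold used to connect themes, and all function names (`kmeans`, `build_theme_graph`, `select_related_context`) are assumptions made for illustration only, and the LLM generation step is left out.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def kmeans(vectors, k, iters=10):
    """k-means-style clustering using cosine similarity for assignment.

    Seeded with the first k vectors so the result is deterministic
    (an assumption of this sketch, not a claim about the paper).
    """
    centroids = [list(v) for v in vectors[:k]]
    assign = [0] * len(vectors)
    for _ in range(iters):
        assign = [
            max(range(k), key=lambda c: cosine(v, centroids[c]))
            for v in vectors
        ]
        for c in range(k):
            members = [v for v, a in zip(vectors, assign) if a == c]
            if members:
                centroids[c] = mean(members)
    return assign, centroids


def build_theme_graph(centroids, threshold):
    """Stage 1 (hypothetical form): link themes whose centroids are similar."""
    graph = defaultdict(set)
    for i in range(len(centroids)):
        for j in range(i + 1, len(centroids)):
            if cosine(centroids[i], centroids[j]) >= threshold:
                graph[i].add(j)
                graph[j].add(i)
    return graph


def select_related_context(question_vec, doc_ids, assign, centroids, graph):
    """Stage 2 (hypothetical form): find the question's theme, then return
    documents from its neighboring themes as candidate related context."""
    anchor = max(
        range(len(centroids)),
        key=lambda c: cosine(question_vec, centroids[c]),
    )
    neighbours = graph.get(anchor, set())
    related = sorted(d for d, a in zip(doc_ids, assign) if a in neighbours)
    return anchor, related


# Toy collection: made-up 2-D "embeddings" for six documents.
doc_ids = ["d1", "d2", "d3", "d4", "d5", "d6"]
vectors = [
    [1.0, 0.0], [0.7, 0.7], [0.0, 1.0],
    [0.9, 0.1], [0.6, 0.8], [0.1, 0.9],
]

assign, centroids = kmeans(vectors, k=3)
graph = build_theme_graph(centroids, threshold=0.6)
anchor, context = select_related_context(
    [1.0, 0.05], doc_ids, assign, centroids, graph
)
print(anchor, context)
```

In a fuller pipeline, the selected documents would be passed to an LLM together with the question and the initial answer to produce the related insights. Drawing context from neighboring themes rather than from the question's own theme is one plausible way to favor diversity while staying relevant; the abstract does not specify the exact selection rule.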