Ensuring large language model (LLM) reliability requires distinguishing objective unsolvability (inherent contradictions) from subjective capability limitations (tasks exceeding model competence). Current LLMs often conflate these two dimensions, producing hallucinations in which they return confident answers to inherently unsolvable queries. To address this issue, we propose UnsolvableQA, a multi-domain dataset containing both solvable and unsolvable questions, together with UnsolvableRL, an alignment framework. First, we construct UnsolvableQA via "Reverse Construction", which systematically injects logical contradictions into otherwise valid reasoning chains. Second, we introduce UnsolvableRL, a reinforcement learning paradigm that balances objective unsolvability detection with calibrated confidence under capability limits. Empirically, our approach achieves near-perfect unsolvability detection (>90% detection rate) and raises solvable reasoning accuracy from 43.4% to 69.4% on Qwen3-4B-Instruct. Crucially, we identify a data-training interaction: strict alignment constraints induce Capability Collapse when unsolvable data are absent, but act as a regularizer for rigor when such data are included, thereby improving overall robustness. Our code and data are available at https://github.com/sfasfaffa/unsolvableQA.
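To make the "Reverse Construction" idea concrete, the sketch below shows one plausible way to derive an unsolvable item from a solvable one by appending a constraint that contradicts the known answer. The data fields, the `reverse_construct` helper, and the contradiction template are illustrative assumptions for this sketch, not the paper's actual pipeline or data format.

```python
# Hypothetical sketch of the "Reverse Construction" idea from the abstract:
# start from a solvable item with a known gold answer, then inject a condition
# that contradicts that answer, so the stated constraints cannot all hold.
from dataclasses import dataclass


@dataclass
class QAItem:
    question: str        # problem statement
    answer: int | None   # gold answer; None marks the item as unsolvable
    solvable: bool       # label used for training and evaluation


def reverse_construct(item: QAItem) -> QAItem:
    """Create an unsolvable variant by appending a clause that conflicts with
    the original gold answer, invalidating the otherwise valid reasoning chain."""
    contradictory_clause = (
        f" Additionally, assume the final result is exactly {item.answer + 1}."
    )
    return QAItem(
        question=item.question + contradictory_clause,
        answer=None,
        solvable=False,
    )


if __name__ == "__main__":
    solvable = QAItem(
        question="Ann has 3 apples and buys 4 more. How many apples does she have?",
        answer=7,
        solvable=True,
    )
    unsolvable = reverse_construct(solvable)
    print(unsolvable.question)   # injected clause contradicts the valid chain 3 + 4 = 7
    print(unsolvable.solvable)   # False
```

Pairing each solvable item with such a contradicted counterpart is one way a dataset could supervise both answering and unsolvability detection; the paper's actual construction and filtering steps may differ.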