Large language models have achieved impressive performance across diverse tasks. However, their tendency to produce overconfident and factually incorrect outputs, known as hallucinations, poses risks in real-world applications. Conformal prediction provides finite-sample, distribution-free coverage guarantees, but standard conformal prediction breaks down under domain shift, often leading to under-coverage and unreliable prediction sets. We propose Domain-Shift-Aware Conformal Prediction (DS-CP), a framework that adapts conformal prediction to large language models under domain shift by systematically reweighting calibration samples according to their proximity to the test prompt, thereby preserving validity while enhancing adaptivity. Our theoretical analysis and experiments on the MMLU benchmark demonstrate that DS-CP delivers more reliable coverage than standard conformal prediction, especially under substantial distribution shifts, while maintaining efficiency. This offers a practical step toward trustworthy uncertainty quantification for large language models in real-world deployment.
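To make the reweighting idea concrete, the following is a minimal Python sketch of proximity-weighted conformal prediction under the stated assumptions: cosine similarity between prompt embeddings serves as the proximity measure, and an exponential weighting converts similarities into calibration weights. The function names, the `temperature` parameter, and the specific weighting scheme are illustrative assumptions, not the exact DS-CP procedure.

```python
import numpy as np

def weighted_conformal_threshold(cal_scores, cal_embeddings, test_embedding,
                                 alpha=0.1, temperature=1.0):
    """Illustrative proximity-weighted conformal threshold (not the paper's exact method).

    cal_scores:     nonconformity scores of calibration examples, shape (n,)
    cal_embeddings: prompt embeddings of calibration examples, shape (n, d)
    test_embedding: embedding of the test prompt, shape (d,)
    """
    # Cosine similarity of each calibration prompt to the test prompt.
    sims = cal_embeddings @ test_embedding / (
        np.linalg.norm(cal_embeddings, axis=1) * np.linalg.norm(test_embedding) + 1e-12
    )
    # Turn similarities into weights; calibration prompts closer to the
    # test prompt receive more mass (temperature is a hypothetical knob).
    w = np.exp(sims / temperature)
    w = np.append(w, w.max())   # weight assigned to the test point itself
    w = w / w.sum()

    # Weighted (1 - alpha) quantile of the calibration scores, treating the
    # test point's mass as sitting at +infinity (conservative choice).
    order = np.argsort(cal_scores)
    sorted_scores = cal_scores[order]
    cum = np.cumsum(w[:-1][order])
    idx = np.searchsorted(cum, 1 - alpha)
    if idx >= len(sorted_scores):
        return np.inf           # fall back to the trivial (full) prediction set
    return sorted_scores[idx]

def prediction_set(option_scores, threshold):
    """Keep every answer option whose nonconformity score is within the threshold."""
    return [i for i, s in enumerate(option_scores) if s <= threshold]
```

In this sketch, standard split conformal prediction is recovered when all weights are equal; the proximity-based weights shift the effective calibration distribution toward examples that resemble the test prompt, which is what allows coverage to be maintained under domain shift.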