Large language models (LLMs) have achieved impressive performance across diverse tasks. However, their tendency to produce overconfident and factually incorrect outputs, known as hallucinations, poses risks in real-world applications. Conformal prediction provides finite-sample, distribution-free coverage guarantees, but these guarantees assume that calibration and test data are exchangeable. Under domain shift this assumption fails, and standard conformal prediction often under-covers and yields unreliable prediction sets. We propose Domain-Shift-Aware Conformal Prediction (DS-CP), a framework that adapts conformal prediction to LLMs under domain shift by systematically reweighting calibration samples according to their proximity to the test prompt, thereby preserving validity while enhancing adaptivity. Our theoretical analysis and experiments on the MMLU benchmark show that DS-CP delivers more reliable coverage than standard conformal prediction, especially under substantial distribution shift, while maintaining efficiency. This work provides a practical step toward trustworthy uncertainty quantification for LLMs in real-world deployment.
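The reweighting idea described above can be illustrated with a generic weighted conformal quantile. The following is a minimal sketch under stated assumptions, not the DS-CP procedure itself: the abstract does not specify the weighting function, so the Gaussian kernel on prompt embeddings, the `bandwidth` parameter, the nonconformity score `1 - p(option)`, and all function names here are hypothetical choices made for illustration.

```python
import numpy as np


def proximity_weights(cal_emb, test_emb, bandwidth=1.0):
    """Weight each calibration prompt by its closeness to the test prompt.

    Assumption: a Gaussian kernel on Euclidean distance between embeddings.
    The source does not state which proximity measure DS-CP uses.
    """
    cal_emb = np.asarray(cal_emb, dtype=float)
    test_emb = np.asarray(test_emb, dtype=float)
    sq_dist = np.sum((cal_emb - test_emb) ** 2, axis=1)
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2))


def weighted_quantile(scores, weights, alpha, test_weight=1.0):
    """Weighted (1 - alpha) quantile of calibration scores.

    The test point carries mass `test_weight` at score +infinity, which is
    the usual device in weighted conformal prediction. Returns np.inf when
    the calibration mass alone cannot reach level 1 - alpha.
    """
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(scores)
    sorted_scores = scores[order]
    sorted_weights = weights[order]
    total = sorted_weights.sum() + test_weight
    cum_mass = np.cumsum(sorted_weights) / total
    # First index whose cumulative normalized mass reaches 1 - alpha.
    idx = np.searchsorted(cum_mass, 1.0 - alpha, side="left")
    if idx >= len(sorted_scores):
        return np.inf
    return sorted_scores[idx]


def prediction_set(option_probs, threshold):
    """Keep every answer option whose score 1 - p(option) is within threshold."""
    scores = 1.0 - np.asarray(option_probs, dtype=float)
    return [i for i, s in enumerate(scores) if s <= threshold]
```

With uniform weights this reduces to the standard split-conformal quantile. With proximity weights, calibration prompts far from the test prompt contribute little mass, so the threshold is driven by calibration examples that resemble the test domain.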