ALIEN: Aligned Entropy Head for Improving Uncertainty Estimation of LLMs

Uncertainty estimation remains a key challenge when adapting pre-trained language models to downstream classification tasks, where models are often overconfident on difficult inputs. Predictive entropy is a strong baseline for uncertainty estimation, but it mainly reflects aleatoric uncertainty and has limited capacity to capture effects such as class overlap or ambiguous linguistic cues. We introduce Aligned Entropy (ALIEN), a lightweight method that refines entropy-based uncertainty by aligning it with prediction reliability. ALIEN trains a small uncertainty head that is initialized to reproduce the model's original entropy and is then fine-tuned with two regularization mechanisms. Experiments on seven classification datasets and two NER benchmarks, using five language models (RoBERTa, ELECTRA, LLaMA-2, Qwen2.5, and Qwen3), show that ALIEN consistently outperforms strong baselines in detecting incorrect predictions across all considered scenarios, while achieving the lowest calibration error. The method adds only a small inference overhead (on the order of milliseconds per batch on CPU) and increases the parameter count by just 0.002% for decoder models and 0.5% for encoder models, without requiring storage of intermediate states. Because it leaves the original model architecture unchanged, the approach is practical for large-scale deployment with modern language models. Our results demonstrate that entropy can be effectively refined through lightweight supervised alignment, producing more reliable uncertainty estimates without modifying the backbone model. The code is publicly available.
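
As an illustration of the starting point described above, the following minimal sketch computes predictive entropy from class logits and wraps it in a head that returns exactly that entropy at initialization. The abstract does not specify the head's architecture or the two regularizers, so `ResidualEntropyHead`, its zero-initialized linear correction, and all parameter names are hypothetical assumptions for illustration, not the authors' implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predictive_entropy(logits):
    """Shannon entropy (in nats) of the softmax distribution over classes."""
    probs = softmax(logits)
    # Skip zero-probability classes, whose contribution p * log(p) tends to 0.
    return -sum(p * math.log(p) for p in probs if p > 0.0)


class ResidualEntropyHead:
    """Hypothetical head: entropy plus a learned linear correction.

    Assumption: the correction weights start at zero, so the head's initial
    output equals the model's original predictive entropy. Training (not
    shown) would adjust the weights to align the score with correctness.
    """

    def __init__(self, hidden_dim):
        self.weights = [0.0] * hidden_dim
        self.bias = 0.0

    def __call__(self, hidden, logits):
        correction = sum(w * h for w, h in zip(self.weights, hidden)) + self.bias
        return predictive_entropy(logits) + correction


# Usage: before any training, the head's score matches plain entropy.
head = ResidualEntropyHead(hidden_dim=4)
logits = [2.0, 0.5, -1.0]
hidden = [0.3, -0.2, 0.9, 0.1]
score = head(hidden, logits)
```

Starting from the entropy baseline means that fine-tuning such a head can only move away from an already reasonable uncertainty score, which is one plausible reading of why an entropy-matching initialization is used.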
