Uncertainty calibration is crucial for many machine learning applications, yet it remains challenging. Many models exhibit hallucinations (confident yet inaccurate responses) due to miscalibrated confidence. Here, we show that random initialization, a standard practice in deep learning, is an underlying cause of this miscalibration, producing excessively high confidence in untrained networks. Our method, inspired by developmental neuroscience, addresses this issue by simply pretraining networks with random noise and random labels, reducing overconfidence and bringing initial confidence levels close to chance. This pre-calibration aligns confidence with accuracy during subsequent training on real data, without any additional pre- or post-processing. Pre-calibrated networks also excel at identifying "unknown" data, showing low confidence for out-of-distribution inputs.
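The core idea above (pretraining on random-noise inputs with random labels drives an overconfident, randomly initialized network toward chance-level confidence) can be sketched with a toy model. This is a minimal illustration, not the paper's implementation: a linear-softmax classifier stands in for a deep network, and the initialization scale, learning rate, and step count are arbitrary choices for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes, dim, n = 10, 64, 2048

# Randomly initialized linear-softmax "network"; the large weight scale
# mimics the overconfidence of an untrained deep model.
W = rng.normal(0.0, 1.0, (dim, n_classes))
b = np.zeros(n_classes)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def mean_confidence(X):
    """Average max softmax probability, i.e. the model's confidence."""
    return softmax(X @ W + b).max(axis=1).mean()

# Pretraining data: pure noise inputs with uniformly random labels.
X = rng.normal(size=(n, dim))
y = rng.integers(0, n_classes, n)
Y = np.eye(n_classes)[y]

conf_before = mean_confidence(X)

# Plain gradient descent on cross-entropy. Because the labels carry no
# signal, the loss is minimized near the uniform prediction, pulling
# confidence down toward chance (1 / n_classes).
lr = 0.5
for _ in range(300):
    P = softmax(X @ W + b)
    G = (P - Y) / n          # gradient of mean cross-entropy w.r.t. logits
    W -= lr * (X.T @ G)
    b -= lr * G.sum(axis=0)

conf_after = mean_confidence(X)
print(f"confidence before: {conf_before:.3f}, "
      f"after: {conf_after:.3f} (chance = {1 / n_classes:.2f})")
```

Before pretraining the model is confidently wrong on noise; after a few hundred steps its confidence collapses toward the chance level, which is the calibrated starting point the abstract describes.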