Recently, combinations of generative and Bayesian machine learning have been introduced in particle physics for both fast detector simulation and inference tasks. These neural networks aim to quantify the uncertainty on the generated distribution arising from limited training statistics. The interpretation of such a distribution-wide uncertainty, however, remains ill-defined. We present a clear scheme for quantifying the calibration of Bayesian generative machine learning models. For a Continuous Normalizing Flow applied to a low-dimensional toy example, we evaluate the calibration of Bayesian uncertainties from either a mean-field Gaussian weight posterior or Monte Carlo sampling of network weights, to gauge their behaviour on unsteady distribution edges. Well-calibrated uncertainties can then be used to roughly estimate the number of uncorrelated truth samples that are equivalent to the generated sample, and they clearly indicate data amplification for smooth features of the distribution.
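One common way to check the calibration described above is to histogram the generated events for each posterior draw of the network weights, form per-bin pulls of the Bayesian mean against the truth histogram, and test whether roughly 68% of bins fall within one predicted standard deviation. The following is a minimal sketch of that procedure with NumPy; the Gaussian toy data and the jittered "generator" standing in for posterior weight samples are illustrative assumptions, not the paper's actual model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 20 posterior draws of network weights, each producing
# 10k generated events; here a jittered Gaussian mocks the generative model.
n_draws, n_events = 20, 10_000
bins = np.linspace(-3.0, 3.0, 31)

# Large "truth" sample defining the target density.
truth = rng.normal(size=500_000)
truth_hist, _ = np.histogram(truth, bins=bins, density=True)

# One histogram per posterior draw; the small scale jitter mimics the
# spread induced by sampling network weights from the posterior.
gen_hists = np.stack([
    np.histogram(rng.normal(scale=1.0 + 0.01 * rng.normal(), size=n_events),
                 bins=bins, density=True)[0]
    for _ in range(n_draws)
])

mu = gen_hists.mean(axis=0)             # per-bin Bayesian mean prediction
sigma = gen_hists.std(axis=0, ddof=1)   # per-bin Bayesian uncertainty

# Pull distribution: for well-calibrated uncertainties, roughly 68% of
# bins should lie within one sigma of the truth.
pulls = (mu - truth_hist) / sigma
coverage = np.mean(np.abs(pulls) < 1.0)
print(f"fraction of bins within 1 sigma: {coverage:.2f}")
```

In this toy the nominal one-sigma coverage is the quantity to compare against 0.68; systematic over- or under-coverage would signal miscalibrated Bayesian uncertainties, which is exactly the failure mode the abstract's calibration scheme is designed to expose.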