Unifying supervised learning and VAEs -- coverage, systematics and goodness-of-fit in normalizing-flow based neural network models for astro-particle reconstructions


October 3, 2023


Thorsten Glüsenkamp

Neural-network-based predictions of event properties in astro-particle physics are becoming increasingly common. In many cases, however, the result is used only as a point prediction: statistical uncertainties and coverage (1), systematic uncertainties (2), and a goodness-of-fit measure (3) are often not calculated. Here we describe a choice of training procedure and network architecture that incorporates all of these properties into a single network model. We show that a KL-divergence objective on the joint distribution of data and labels unifies supervised learning and variational autoencoders (VAEs) under one umbrella of stochastic variational inference. This unification motivates an extended supervised learning scheme that makes it possible to calculate a goodness-of-fit p-value for the neural network model. Conditional normalizing flows amortized with a neural network are crucial in this construction. We discuss how they allow coverage to be rigorously defined for posteriors defined jointly on a product space, e.g. $\mathbb{R}^n \times \mathcal{S}^m$, which encompasses posteriors over directions. Finally, systematic uncertainties are naturally included in the variational viewpoint. The proposed extended supervised training with amortized normalizing flows incorporates (1) coverage calculation, (2) systematics and (3) a goodness-of-fit measure in a single machine-learning model. These properties hold without constraints on the shape of the involved distributions (e.g. Gaussianity); in fact, the approach works with complex multi-modal distributions defined on product spaces like $\mathbb{R}^n \times \mathcal{S}^m$. We see great potential for exploiting this per-event information in event selections or for fast astronomical alerts that require uncertainty guarantees.
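To make the notion of per-event coverage concrete, the following is a minimal toy sketch and not the author's implementation. It assumes that each posterior is a 2-D Gaussian written as an affine flow $z = \mu + L u$ with a standard-normal base variable $u$; in the setting of the abstract, $\mu$ and $L$ would instead be replaced by a conditional normalizing flow predicted by a neural network. The function names (`base_space_radius2`, `empirical_coverage`, `simulate_events`) and the simulated data are hypothetical. The idea illustrated is that the true label can be mapped back to the base space, where the enclosed probability mass has a closed form, and that a calibrated model contains the truth in the $x\%$ credible region for about $x\%$ of events.

```python
import math
import random


def base_space_radius2(z, mu, chol):
    """Invert the 2-D affine flow z = mu + L u and return |u|^2.

    `chol` is the lower-triangular matrix L given as ((l11, 0), (l21, l22)).
    """
    (l11, _), (l21, l22) = chol
    u1 = (z[0] - mu[0]) / l11
    u2 = (z[1] - mu[1] - l21 * u1) / l22
    return u1 * u1 + u2 * u2


def empirical_coverage(events, levels):
    """Fraction of events whose true label lies inside each credible region."""
    # For a 2-D standard-normal base, |u|^2 is chi-squared distributed with
    # 2 degrees of freedom, so the enclosed mass is 1 - exp(-|u|^2 / 2).
    masses = [
        1.0 - math.exp(-0.5 * base_space_radius2(z, mu, chol))
        for z, mu, chol in events
    ]
    return [sum(m <= level for m in masses) / len(masses) for level in levels]


def simulate_events(n, seed=0):
    """Toy data (assumption): truths drawn from the predicted posteriors."""
    rng = random.Random(seed)
    events = []
    for _ in range(n):
        mu = (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
        chol = (
            (rng.uniform(0.5, 2.0), 0.0),
            (rng.uniform(-1.0, 1.0), rng.uniform(0.5, 2.0)),
        )
        u = (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
        z = (
            mu[0] + chol[0][0] * u[0],
            mu[1] + chol[1][0] * u[0] + chol[1][1] * u[1],
        )
        events.append((z, mu, chol))
    return events


levels = [0.5, 0.68, 0.9]
coverage = empirical_coverage(simulate_events(20000), levels)
for level, cov in zip(levels, coverage):
    print(f"expected {level:.2f}, observed {cov:.3f}")
```

Because the toy truths are drawn from the very posteriors being tested, the observed fractions should lie close to the nominal levels; a miscalibrated model would show systematic over- or under-coverage instead.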

