Variational regularization is a classical technique for solving statistical inference tasks and inverse problems, and modern data-driven approaches that parameterize regularizers via deep neural networks have showcased impressive empirical performance. Recent works along these lines learn task-dependent regularizers. This is done by integrating information about the measurements and ground-truth data in an unsupervised, critic-based loss function, where the regularizer assigns low values to likely data and high values to unlikely data. However, there is little theory about the structure of regularizers learned via this process and how it relates to the two data distributions. To make progress on this challenge, we initiate a study of optimizing critic-based loss functions to learn regularizers over a particular family: gauges (or Minkowski functionals) of star-shaped bodies. This family contains regularizers that are commonly employed in practice and shares properties with regularizers parameterized by deep neural networks. We specifically investigate critic-based losses derived from variational representations of statistical distances between probability measures. By leveraging tools from star geometry and dual Brunn-Minkowski theory, we illustrate how these losses can be interpreted as dual mixed volumes that depend on the data distribution. This allows us to derive exact expressions for the optimal regularizer in certain cases. Finally, we identify which neural network architectures give rise to such star body gauges and when such regularizers have favorable properties for optimization. More broadly, this work highlights how the tools of star geometry can aid in understanding the geometry of unsupervised regularizer learning.
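As a minimal numerical sketch (not taken from the paper), the two central objects above can be illustrated concretely: the gauge of a star body recovered from its radial function, and a critic-style objective that scores likely data low and unlikely data high. The choice of the l1 ball as the star body, the radial function `rho_l1`, and the Gaussian "likely"/"unlikely" samples are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def gauge(x, radial):
    """Gauge (Minkowski functional) of a star body K:
    ||x||_K = inf{t > 0 : x/t in K} = |x|_2 / rho_K(x/|x|_2),
    where rho_K is the radial function of K."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return (norms / radial(x / norms)).squeeze(-1)

# Radial function of the l1 unit ball: rho(u) = 1/||u||_1 for unit u,
# so the resulting gauge is exactly the l1 norm.
rho_l1 = lambda u: 1.0 / np.abs(u).sum(axis=-1, keepdims=True)

x = rng.normal(size=(1000, 2))        # "likely" (ground-truth-like) samples
y = x + rng.normal(size=x.shape)      # "unlikely" (measurement-like) samples

R = lambda z: gauge(z, rho_l1)
# Critic-style objective: a good regularizer assigns low values to likely
# data and high values to unlikely data, so one would minimize this
# quantity over the family of star bodies K.
critic_loss = R(x).mean() - R(y).mean()
print(critic_loss)  # negative when R separates the two distributions
```

Here the gauge is computed purely from the radial function, which is what lets the losses in the abstract be rewritten as dual mixed volumes: the objective depends on K only through averages of `rho_K` against the two data distributions.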