Probabilistic and Semantic Descriptions of Image Manifolds and Their Applications

This paper begins with a description of methods for estimating probability density functions for images. These methods reflect the observation that image data is usually constrained to lie in restricted regions of the high-dimensional image space: not every pattern of pixels is an image. It is common to say that images lie on a lower-dimensional manifold in the high-dimensional space. However, although images may lie on such lower-dimensional manifolds, not all points on the manifold are equally likely to be images. Images are unevenly distributed on the manifold, and our task is to devise ways to model this distribution as a probability distribution. In pursuing this goal, we consider generative models that are popular in the AI and computer vision communities. For our purposes, generative/probabilistic models should have two properties: 1) sample generation: it should be possible to draw samples according to the modelled density function, and 2) probability computation: given a previously unseen sample from the dataset of interest, one should be able to compute its probability, at least up to a normalising constant. To this end, we investigate the use of methods such as normalising flows and diffusion models. We then show that such probabilistic descriptions can be used to construct defences against adversarial attacks. In addition to describing the manifold in terms of density, we also consider how semantic interpretations can be used to describe points on the manifold. To this end, we consider an emergent language framework which makes use of variational encoders to produce a disentangled representation of points that reside on a given manifold. Trajectories between points on a manifold can then be described in terms of evolving semantic descriptions.
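
The two required properties, sample generation and probability computation, can be illustrated with a minimal sketch. This is not one of the models studied in the paper: it is a toy element-wise affine normalising flow written in NumPy, and the class name `AffineFlow` and its parameters `mu` and `log_sigma` are hypothetical names chosen for illustration. Sampling pushes Gaussian noise through an invertible map, and the log-density follows from the change-of-variables formula.

```python
import numpy as np


class AffineFlow:
    """Toy flow (illustrative assumption): x = mu + exp(log_sigma) * z, z ~ N(0, I)."""

    def __init__(self, mu, log_sigma):
        self.mu = np.asarray(mu, dtype=float)
        self.log_sigma = np.asarray(log_sigma, dtype=float)

    def sample(self, n, rng):
        # Property 1 (sample generation): draw base noise and push it
        # through the invertible map.
        z = rng.standard_normal((n, self.mu.size))
        return self.mu + np.exp(self.log_sigma) * z

    def log_prob(self, x):
        # Property 2 (probability computation): invert the map, then apply
        # log p(x) = log N(z; 0, I) - sum(log_sigma).
        x = np.atleast_2d(x)
        z = (x - self.mu) * np.exp(-self.log_sigma)
        dim = self.mu.size
        log_base = -0.5 * np.sum(z**2, axis=1) - 0.5 * dim * np.log(2.0 * np.pi)
        return log_base - np.sum(self.log_sigma)


flow = AffineFlow(mu=[0.0, 1.0], log_sigma=[0.0, np.log(2.0)])
samples = flow.sample(1000, np.random.default_rng(0))  # array of shape (1000, 2)
log_p = flow.log_prob(samples)                         # one log-density per sample
```

Here the density is exactly normalised; practical flows stack many learned invertible layers, while the abstract's weaker requirement is only that the probability be computable up to a normalising constant.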
