Probabilistic and Semantic Descriptions of Image Manifolds and Their Applications

This paper begins with a description of methods for estimating probability density functions for images that reflect the observation that such data is usually constrained to lie in restricted regions of the high-dimensional image space: not every pattern of pixels is an image. It is common to say that images lie on a lower-dimensional manifold in the high-dimensional space. However, although images may lie on such lower-dimensional manifolds, not all points on the manifold are equally likely to be images. Images are unevenly distributed on the manifold, and our task is to devise ways to model this distribution as a probability distribution. In pursuing this goal, we consider generative models that are popular in the AI and computer vision communities. For our purposes, generative/probabilistic models should have two properties: 1) sample generation: it should be possible to sample from the distribution according to the modelled density function, and 2) probability computation: given a previously unseen sample from the dataset of interest, it should be possible to compute the probability of the sample, at least up to a normalising constant. To this end, we investigate the use of methods such as normalising flows and diffusion models. We then show that such probabilistic descriptions can be used to construct defences against adversarial attacks. In addition to describing the manifold in terms of density, we also consider how semantic interpretations can be used to describe points on the manifold. To this end, we consider an emergent language framework which makes use of variational encoders to produce a disentangled representation of points that reside on a given manifold. Trajectories between points on a manifold can then be described in terms of evolving semantic descriptions.
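
As an illustration of the two properties named above (sample generation and probability computation), the following is a minimal sketch of a one-dimensional normalising flow. It is a toy under stated assumptions, not the models studied in the paper: the class name `SinhFlow1D`, the map x = sinh(a z + b), and the parameter values are all hypothetical choices made for this example, and real image models use learned, high-dimensional invertible maps.

```python
import math
import random


class SinhFlow1D:
    """Toy 1-D normalising flow: x = sinh(a * z + b), with z ~ N(0, 1).

    The map and the default parameters are illustrative assumptions,
    not taken from the paper.
    """

    def __init__(self, a: float = 0.8, b: float = 0.3) -> None:
        if a == 0.0:
            raise ValueError("a must be non-zero for the map to be invertible")
        self.a = a
        self.b = b

    def sample(self, rng: random.Random) -> float:
        # Property 1 (sample generation): draw from the base density,
        # then push the draw through the invertible map.
        z = rng.gauss(0.0, 1.0)
        return math.sinh(self.a * z + self.b)

    def log_prob(self, x: float) -> float:
        # Property 2 (probability computation): invert the map and apply
        # the change-of-variables formula
        #   log p(x) = log N(z; 0, 1) + log |dz/dx|.
        z = (math.asinh(x) - self.b) / self.a
        log_base = -0.5 * (z * z + math.log(2.0 * math.pi))
        # dz/dx = 1 / (a * sqrt(1 + x^2))
        log_det = -math.log(abs(self.a)) - 0.5 * math.log1p(x * x)
        return log_base + log_det


if __name__ == "__main__":
    flow = SinhFlow1D()
    rng = random.Random(0)
    xs = [flow.sample(rng) for _ in range(5)]
    for x in xs:
        print(f"x = {x:+.4f}, log p(x) = {flow.log_prob(x):.4f}")
```

In this sketch the density is exactly normalised, so `log_prob` returns a true log-density. Models that only give the probability up to a normalising constant would return an unnormalised score instead.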
