We consider deep multivariate models for heterogeneous collections of random variables. In the context of computer vision, such collections may consist, for example, of images, segmentations, image attributes, and latent variables. When developing such models, most existing works start from an application task and design the model components and their dependencies to meet the needs of the chosen task. This has the disadvantage of limiting the applicability of the resulting model to other downstream tasks. Here, instead, we propose to represent the joint probability distribution by means of conditional probability distributions, one for each group of variables conditioned on the rest. Such models can then be used for practically any downstream task. Their learning can be approached as training a parametrised Markov chain kernel by maximising the data likelihood of its limiting distribution. This has the additional advantage of allowing a wide range of semi-supervised learning scenarios.
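The idea of representing the joint distribution through per-group conditionals and sampling from it with a Markov chain kernel can be illustrated by a minimal Gibbs-style sketch. This is only an assumption-laden toy: two scalar groups `x` and `y`, with hand-set Gaussian conditionals standing in for the learned parametrised conditionals; in the actual method these would be deep networks trained by maximising the likelihood of the chain's limiting distribution.

```python
import numpy as np

# Toy sketch (NOT the paper's model): the joint p(x, y) is represented
# only through the conditionals p(x | y) and p(y | x). Resampling each
# group given the rest defines a Markov chain kernel; its limiting
# distribution is the joint the conditionals jointly encode.

rng = np.random.default_rng(0)
RHO = 0.8  # hand-set correlation, standing in for learned parameters


def sample_x_given_y(y):
    # stand-in for a learned conditional p(x | y)
    return RHO * y + np.sqrt(1.0 - RHO**2) * rng.standard_normal()


def sample_y_given_x(x):
    # stand-in for a learned conditional p(y | x)
    return RHO * x + np.sqrt(1.0 - RHO**2) * rng.standard_normal()


def kernel_step(state):
    # one full sweep: resample each variable group given the rest
    x, y = state
    x = sample_x_given_y(y)
    y = sample_y_given_x(x)
    return x, y


# Run the chain; after burn-in, samples approximate the limiting
# distribution (here, a bivariate Gaussian with correlation RHO).
state = (0.0, 0.0)
samples = []
for t in range(20000):
    state = kernel_step(state)
    if t >= 1000:
        samples.append(state)

xs, ys = np.array(samples).T
print("empirical correlation:", np.corrcoef(xs, ys)[0, 1])
```

Because the model is specified only through conditionals, the same chain supports arbitrary downstream queries: clamping any subset of groups to observed values and resampling only the rest yields conditional samples, which is also what makes the semi-supervised scenarios mentioned above natural.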