TC-VAE: Uncovering Out-of-Distribution Data Generative Factors

Uncovering the generative factors of data is the ultimate goal of disentanglement learning. Although many works have proposed disentangling generative models that can uncover the underlying generative factors of a dataset, none has so far been able to uncover out-of-distribution (OOD) generative factors, i.e., factors of variation that are not explicitly present in the dataset. Moreover, the datasets used to validate these models are synthetically generated from a balanced mixture of predefined generative factors, which implicitly assumes that generative factors are uniformly distributed across the dataset. Real datasets do not have this property. In this work we analyse the effect of using datasets with unbalanced generative factors, providing qualitative and quantitative results for widely used generative models. We also propose TC-VAE, a generative model optimized using a lower bound of the joint total correlation between the learned latent representations and the input data. We show that the proposed model is able to uncover OOD generative factors on different datasets and, on average, outperforms the related baselines in terms of downstream disentanglement metrics.
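
The abstract does not state the form of the lower bound, so the sketch below does not reproduce the TC-VAE objective. It only illustrates the standard quantity the bound is built on: the total correlation of a random vector, i.e. the KL divergence between its joint distribution and the product of its marginals. For a Gaussian with covariance Sigma this has the closed form 0.5 * (sum_j log Sigma_jj - log det Sigma). The Gaussian assumption and the function names are illustrative choices, not taken from the paper.

```python
import math


def log_det_spd(matrix):
    """Log-determinant of a symmetric positive-definite matrix via Cholesky."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - partial
                if pivot <= 0.0:
                    raise ValueError("matrix is not positive definite")
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    # det(A) = prod(L_ii)^2, so log det(A) = 2 * sum(log L_ii)
    return 2.0 * sum(math.log(lower[i][i]) for i in range(n))


def gaussian_total_correlation(cov):
    """Total correlation (in nats) of a Gaussian with covariance `cov`.

    Assumption for illustration only: the variables are jointly Gaussian,
    so TC = 0.5 * (sum_j log cov[j][j] - log det cov).
    """
    marginal_term = sum(math.log(cov[i][i]) for i in range(len(cov)))
    return 0.5 * (marginal_term - log_det_spd(cov))


if __name__ == "__main__":
    independent = [[1.0, 0.0], [0.0, 1.0]]
    correlated = [[1.0, 0.5], [0.5, 1.0]]
    print(gaussian_total_correlation(independent))
    print(gaussian_total_correlation(correlated))
```

Total correlation is zero exactly when the components are independent and grows as they become more dependent, which is why it is a common ingredient in disentanglement objectives.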

