Recently, nonlinear ICA has emerged as a popular alternative to the many heuristic models used in deep representation learning and disentanglement. An advantage of nonlinear ICA is that a sophisticated identifiability theory has been developed; in particular, it has been proven that the original components can be recovered under sufficiently strong latent dependencies. Despite this general theory, practical nonlinear ICA algorithms have so far been mainly limited to data with one-dimensional latent dependencies, especially time-series data. In this paper, we introduce a new nonlinear ICA framework that employs $t$-process (TP) latent components, which apply naturally to data with higher-dimensional dependency structures, such as spatial and spatio-temporal data. In particular, we develop a new learning and inference algorithm that extends variational inference methods to handle the combination of a deep neural network mixing function with the TP prior, and employs the method of inducing points for computational efficiency. On the theoretical side, we show that such TP independent components are identifiable under very general conditions. Further, Gaussian process (GP) nonlinear ICA is established as a limit of the TP nonlinear ICA model, and we prove that the identifiability of the latent components at this GP limit is more restricted: those components are identifiable if and only if they have distinctly different covariance kernels. Our algorithm and identifiability theorems are explored on simulated spatial data and real-world spatio-temporal data.
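The generative side of such a model can be illustrated with a minimal sketch. It is not the paper's implementation, and it does not cover the variational inference algorithm or inducing points. It draws independent latent components from $t$-processes on a two-dimensional spatial grid and passes them through a random nonlinear mixing. Everything specific here is an assumption made for illustration: the RBF kernel, the length-scales, the degrees of freedom `nu`, the leaky-ReLU network, and the helper names `make_grid`, `rbf_kernel`, `sample_tp` and `random_mlp_mixing`. The TP draw uses the standard construction of a multivariate Student-$t$ vector as a Gaussian vector rescaled by a chi-square variable, so the GP case is approached as `nu` grows large.

```python
import numpy as np

def make_grid(n_side):
    """Return an (n_side**2, 2) array of locations on the unit square."""
    axis = np.linspace(0.0, 1.0, n_side)
    xx, yy = np.meshgrid(axis, axis)
    return np.column_stack([xx.ravel(), yy.ravel()])

def rbf_kernel(X, lengthscale):
    """Squared-exponential covariance between all pairs of rows of X."""
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dist / lengthscale ** 2)

def sample_tp(X, lengthscale, nu, rng, jitter=1e-6):
    """Draw one zero-mean t-process sample at the locations X.

    Uses s = sqrt(nu / u) * L z with z ~ N(0, I), u ~ chi2(nu) and
    L L^T = K, which yields a multivariate Student-t vector.
    """
    n = X.shape[0]
    K = rbf_kernel(X, lengthscale) + jitter * np.eye(n)  # jitter for stability
    L = np.linalg.cholesky(K)
    z = rng.standard_normal(n)
    u = rng.chisquare(nu)
    return np.sqrt(nu / u) * (L @ z)

def random_mlp_mixing(S, n_layers, rng, slope=0.2):
    """Mix the columns of S with a random square MLP (hypothetical mixing).

    Square Gaussian weight matrices are invertible with probability one and
    leaky ReLU is invertible, so the overall map is invertible almost surely.
    """
    d = S.shape[1]
    H = S
    for layer in range(n_layers):
        W = rng.standard_normal((d, d))
        H = H @ W
        if layer < n_layers - 1:  # no nonlinearity after the last layer
            H = np.where(H > 0, H, slope * H)
    return H

rng = np.random.default_rng(0)
grid = make_grid(10)                      # 100 spatial locations
lengthscales = (0.1, 0.2, 0.4)            # one distinct kernel per component
S = np.column_stack([sample_tp(grid, ls, 4.0, rng) for ls in lengthscales])
Y = random_mlp_mixing(S, n_layers=3, rng=rng)   # observed data, shape (100, 3)
```

Each component is given a different length-scale in this sketch, echoing the condition stated for the GP limit that components must have distinct covariance kernels to be identifiable.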