Nonlinear independent component analysis (ICA) aims to recover the underlying independent latent sources from their observable nonlinear mixtures. Making the nonlinear ICA model identifiable up to certain trivial indeterminacies is a long-standing problem in unsupervised learning. Recent breakthroughs reformulate the standard independence assumption on the sources as conditional independence given some auxiliary variables (e.g., class labels and/or domain/time indices), which serve as weak supervision or inductive bias. However, nonlinear ICA with unconditional priors cannot benefit from such developments. We explore an alternative path and consider only assumptions on the mixing process, such as structural sparsity. We show that under specific instantiations of such constraints, the independent latent sources can be identified from their nonlinear mixtures up to a permutation and a component-wise transformation, thus achieving nontrivial identifiability of nonlinear ICA without auxiliary variables. We provide estimation methods and validate the theoretical results experimentally. The results on image data suggest that our conditions may hold in a number of practical data-generating processes.
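For readers unfamiliar with the terminology, the notion of identifiability "up to a permutation and a component-wise transformation" can be written out as follows. This is a sketch of the standard nonlinear ICA setup, not the paper's formal statement. The symbols are assumptions introduced here for illustration and do not appear in the abstract: $\mathbf{s}$ for the sources, $\mathbf{x}$ for the observations, $\mathbf{f}$ for the mixing function, $\hat{\mathbf{f}}$ for its estimate, $\pi$ for a permutation, and $h_i$ for invertible scalar functions.

```latex
\begin{aligned}
&\text{Generative model:} && \mathbf{x} = \mathbf{f}(\mathbf{s}), \qquad p(\mathbf{s}) = \prod_{i=1}^{n} p_i(s_i), \\
&\text{Estimated sources:} && \hat{\mathbf{s}} = \hat{\mathbf{f}}^{-1}(\mathbf{x}), \\
&\text{Identifiability:} && \hat{s}_i = h_i\bigl(s_{\pi(i)}\bigr), \qquad i = 1, \dots, n.
\end{aligned}
```

Under this reading, each recovered component depends on exactly one true source, so the only remaining ambiguities are the ordering of the components and an invertible rescaling of each one.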