Nonlinear independent component analysis (ICA) aims to recover the underlying independent latent sources from their observable nonlinear mixtures. How to make the nonlinear ICA model identifiable up to certain trivial indeterminacies is a long-standing problem in unsupervised learning. Recent breakthroughs reformulate the standard independence assumption on the sources as conditional independence given some auxiliary variables (e.g., class labels and/or domain/time indices), which serve as weak supervision or inductive bias. However, nonlinear ICA with unconditional priors cannot benefit from such developments. We explore an alternative path and consider only assumptions on the mixing process, such as structural sparsity. We show that under specific instantiations of such constraints, the independent latent sources can be identified from their nonlinear mixtures up to a permutation and a component-wise transformation, thus achieving nontrivial identifiability of nonlinear ICA without auxiliary variables. We provide estimation methods and validate the theoretical results experimentally. The results on image data suggest that our conditions may hold in a number of practical data-generating processes.
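To make the identifiability claim concrete, the setting can be sketched in symbols as follows. The notation here ($\mathbf{s}$, $\mathbf{x}$, $\mathbf{f}$, $\hat{\mathbf{f}}$, $\pi$, $h_i$, $n$) is assumed for illustration and is not taken from the text above; it follows the usual way nonlinear ICA is written and may differ from the notation used in the body of the paper.

```latex
% Assumed notation (illustrative only):
%   s = (s_1, ..., s_n)  latent sources
%   x                    observed mixtures
%   f                    unknown invertible nonlinear mixing function
\begin{align}
  \mathbf{x} &= \mathbf{f}(\mathbf{s}),
  &
  p(\mathbf{s}) &= \prod_{i=1}^{n} p_i(s_i).
\end{align}
% Identifiability up to a permutation and a component-wise transformation:
% for an estimated model \hat{f} whose recovered sources
% \hat{s} = \hat{f}^{-1}(x) are mutually independent and which satisfies
% the same assumptions on the mixing process, there exist a permutation
% \pi of {1, ..., n} and invertible scalar functions h_1, ..., h_n with
\begin{equation}
  \hat{s}_i = h_i\bigl(s_{\pi(i)}\bigr), \qquad i = 1, \dots, n.
\end{equation}
```

Under this reading, each recovered component depends on exactly one true source, so the remaining ambiguity is limited to the ordering of the sources and an invertible rescaling of each one.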