Nonlinear independent component analysis (ICA) aims to recover the underlying independent latent sources from their observable nonlinear mixtures. Making the nonlinear ICA model identifiable up to certain trivial indeterminacies is a long-standing problem in unsupervised learning. Recent breakthroughs reformulate the standard independence assumption on the sources as conditional independence given some auxiliary variables (e.g., class labels and/or domain/time indices), which serve as weak supervision or inductive bias. However, nonlinear ICA with unconditional priors cannot benefit from such developments. We explore an alternative path and consider only assumptions on the mixing process, such as Structural Sparsity. We show that under specific instantiations of such constraints, the independent latent sources can be identified from their nonlinear mixtures up to a permutation and a component-wise transformation, thus achieving nontrivial identifiability of nonlinear ICA without auxiliary variables. We provide estimation methods and validate the theoretical results experimentally. The results on image data suggest that our conditions may hold in a number of practical data-generating processes.
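As a sketch of the setting described above, the model and the notion of identifiability can be written as follows. The notation is assumed here for illustration and does not come from the abstract: $x$ denotes the observations, $s$ the $n$ latent sources, $f$ the unknown invertible nonlinear mixing function, $\hat{s}$ the recovered sources, $\pi$ a permutation of $\{1,\dots,n\}$, and each $h_i$ an invertible scalar function.

```latex
\begin{align}
  x &= f(s), \qquad p(s) = \prod_{i=1}^{n} p_i(s_i), \\
  \hat{s}_i &= h_i\bigl(s_{\pi(i)}\bigr), \qquad i = 1, \dots, n .
\end{align}
```

The first line states the unconditional independence of the sources and the nonlinear mixing; the second expresses recovery up to a permutation and a component-wise transformation. Under one natural reading, which is an assumption here rather than a statement from the abstract, a condition such as Structural Sparsity would constrain the support (the pattern of nonzero entries) of the Jacobian $J_f(s) = \partial f / \partial s$ of the mixing function.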