This work introduces a novel principle for disentanglement we call mechanism sparsity regularization, which applies when the latent factors of interest depend sparsely on observed auxiliary variables and/or past latent factors. We propose a representation learning method that induces disentanglement by simultaneously learning the latent factors and the sparse causal graphical model that explains them. We develop a nonparametric identifiability theory that formalizes this principle and shows that the latent factors can be recovered by regularizing the learned causal graph to be sparse. More precisely, we show identifiability up to a novel equivalence relation we call "consistency", which allows some latent factors to remain entangled (hence the term partial disentanglement). To describe the structure of this entanglement, we introduce the notions of entanglement graphs and graph-preserving functions. We further provide a graphical criterion that guarantees complete disentanglement, that is, identifiability up to permutations and element-wise transformations. We demonstrate the scope of the mechanism sparsity principle, as well as the assumptions it relies on, with several worked-out examples. For instance, the framework shows how one can leverage multi-node interventions with unknown targets on the latent factors to disentangle them. We further draw connections between our nonparametric results and the now-popular exponential family assumption. Lastly, we propose an estimation procedure based on variational autoencoders and a sparsity constraint, and demonstrate it on various synthetic datasets. This work is a significantly extended version of Lachapelle et al. (2022).
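To make "identifiability up to permutations and element-wise transformations" concrete, the following is a minimal sketch restricted to the linear case, which is a simplification of the nonparametric setting described above. It assumes the map from true to learned latent factors is a square matrix `L`; complete disentanglement then corresponds to `L` having exactly one nonzero entry in each row and column (a permutation combined with per-coordinate rescaling). The function name `is_permutation_scaling` and the tolerance `tol` are illustrative choices and are not taken from the paper.

```python
import numpy as np


def is_permutation_scaling(L, tol=1e-8):
    """Return True if the square matrix L has exactly one entry with
    magnitude above tol in every row and every column.

    In the linear simplification assumed here, such an L permutes the
    latent factors and rescales each one, but never mixes them.
    """
    L = np.asarray(L, dtype=float)
    if L.ndim != 2 or L.shape[0] != L.shape[1]:
        return False
    mask = np.abs(L) > tol
    one_per_column = np.all(mask.sum(axis=0) == 1)
    one_per_row = np.all(mask.sum(axis=1) == 1)
    return bool(one_per_column and one_per_row)


# Each learned factor is a rescaled copy of exactly one true factor.
disentangled = np.array([[0.0, 2.0, 0.0],
                         [-1.5, 0.0, 0.0],
                         [0.0, 0.0, 0.5]])

# The first learned factor mixes two true factors, so some entanglement remains.
entangled = np.array([[1.0, 0.3, 0.0],
                      [0.0, 1.0, 0.0],
                      [0.0, 0.0, 1.0]])

print(is_permutation_scaling(disentangled))
print(is_permutation_scaling(entangled))
```

In the second matrix, the off-diagonal entry plays the role of an edge in what the abstract calls an entanglement graph: it records which latent factors remain mixed under partial disentanglement.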