Deep latent generative models have attracted increasing attention because they combine the strengths of deep learning and probabilistic models in an elegant way. The data representations learned by these models are often continuous and dense. In many applications, however, sparse representations are preferred, for example when learning sparse high-dimensional embeddings of data in an unsupervised setting, or when predicting multiple labels from thousands of candidate tags in a supervised setting. Some scenarios impose a further restriction on the degree of sparsity: the number of non-zero features of a representation cannot exceed a pre-defined threshold $L_0$. In this paper we propose a sparse deep latent generative model (SDLGM) that explicitly models the degree of sparsity and can therefore learn the sparse structure of the data under this quantified sparsity constraint. The resulting sparsity of a representation is not fixed, but adapts to the observation itself within the pre-defined restriction. In particular, for each observation $i$ we introduce an auxiliary random variable $L_i$ that models the sparsity of its representation. The sparse representations are then generated by a two-step sampling process using two Gumbel-Softmax distributions. For inference and learning, we develop an amortized variational method based on a Monte Carlo (MC) gradient estimator. The resulting sparse representations are differentiable, so the model can be trained with backpropagation. Experimental evaluation on multiple datasets for unsupervised and supervised learning problems shows the benefits of the proposed method.
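To make the two-step sampling idea concrete, the following is a minimal sketch in plain Python, not the construction used in the paper. The abstract does not say how the two Gumbel-Softmax distributions are combined, so the scheme here is an assumption: step 1 draws a relaxed one-hot sample over the sparsity levels $1, \dots, L_0$ (standing in for $L_i$), and step 2 draws up to $L_0$ relaxed one-hot feature selections, each weighted by the relaxed probability that its slot is active. The names `gumbel_softmax`, `sample_sparse_mask`, `sparsity_logits`, and `feature_logits` are hypothetical.

```python
import math
import random


def gumbel_softmax(logits, temperature, rng):
    """Draw one relaxed one-hot sample (a point on the simplex) from logits."""
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise by inverse transform; clamp u to avoid log(0).
        u = max(rng.random(), 1e-12)
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / temperature)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    top = max(noisy)
    exps = [math.exp(v - top) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def sample_sparse_mask(sparsity_logits, feature_logits, temperature, rng):
    """Assumed two-step scheme: sample a sparsity level, then select features.

    len(sparsity_logits) plays the role of the threshold L_0.
    Returns (level, mask): level is a relaxed one-hot vector over 1..L_0,
    mask is a soft selection over features whose entries sum to at most L_0.
    """
    max_nonzero = len(sparsity_logits)

    # Step 1: relaxed one-hot sample over the sparsity levels 1..L_0.
    level = gumbel_softmax(sparsity_logits, temperature, rng)

    # keep[l] is the relaxed probability that the level is at least l + 1,
    # i.e. that selection slot l is active.
    keep = [sum(level[l:]) for l in range(max_nonzero)]

    # Step 2: one relaxed one-hot feature selection per slot, weighted by keep.
    mask = [0.0] * len(feature_logits)
    for l in range(max_nonzero):
        pick = gumbel_softmax(feature_logits, temperature, rng)
        for k, p in enumerate(pick):
            mask[k] += keep[l] * p

    # Slots may pick the same feature, so clip each entry to at most 1.
    return level, [min(1.0, m) for m in mask]


rng = random.Random(0)
level, mask = sample_sparse_mask(
    sparsity_logits=[0.0, 1.0, 0.5],                  # L_0 = 3 in this toy setup
    feature_logits=[0.2, -1.0, 0.3, 2.0, 0.0, -0.5],  # 6 candidate features
    temperature=0.5,
    rng=rng,
)
print(level, mask)
```

At a finite temperature every mask entry is small but strictly positive, so the mask is only approximately sparse; as the temperature decreases, the samples approach hard one-hot vectors and the number of clearly non-zero entries is bounded by the sampled level. In an autodiff framework the same operations would let gradients flow through both sampling steps, which is the property the abstract relies on.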