The Gaussianity assumption has been consistently criticized as a main limitation of the Variational Autoencoder (VAE), despite its computational efficiency. In this paper, we propose a new approach that expands the model capacity (i.e., the expressive power of the distributional family) without sacrificing the computational advantages of the VAE framework. The decoder of our VAE model is an infinite mixture of asymmetric Laplace distributions, which can fit general distributions of continuous variables. The model can be represented as a special form of a nonparametric M-estimator for general quantile functions, and we theoretically establish the connection between the proposed model and quantile estimation. We apply the proposed model to synthetic data generation, where it has a particular advantage: the level of data privacy can be adjusted easily.
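The link between the asymmetric Laplace distribution (ALD) and quantile estimation mentioned above can be illustrated with a minimal sketch. It relies on the standard fact that, for a fixed scale, the ALD negative log-likelihood equals the check (pinball) loss of quantile regression up to an additive constant. This is not the paper's model, which mixes over all quantile levels inside a VAE decoder; the function names, the grid search, and the exponential toy data are assumptions made only for illustration.

```python
import numpy as np

def pinball_loss(y, q, alpha):
    """Check (pinball) loss rho_alpha(y - q) used in quantile regression."""
    u = y - q
    # Weight alpha on positive residuals, (alpha - 1) on negative ones.
    return u * np.where(u < 0, alpha - 1.0, alpha)

def ald_neg_log_lik(y, q, alpha, beta=1.0):
    """Negative log-density of an asymmetric Laplace distribution with
    location q, skewness alpha in (0, 1), and scale beta > 0."""
    return -np.log(alpha * (1.0 - alpha) / beta) + pinball_loss(y, q, alpha) / beta

def fit_quantile(y, alpha, grid):
    """Pick the grid point that minimizes the mean pinball loss
    (hypothetical helper; a real model would use gradient descent)."""
    losses = [pinball_loss(y, q, alpha).mean() for q in grid]
    return grid[int(np.argmin(losses))]

# Toy data (assumption): a skewed, clearly non-Gaussian sample.
rng = np.random.default_rng(0)
y = rng.exponential(size=5000)

grid = np.linspace(0.0, 5.0, 501)
q_hat = fit_quantile(y, alpha=0.9, grid=grid)  # pinball-loss minimizer
q_emp = np.quantile(y, 0.9)                    # empirical 0.9 quantile
print(q_hat, q_emp)
```

Because the two losses differ only by a constant in `q`, maximizing the ALD likelihood over the location parameter at level `alpha` is the same as estimating the `alpha`-quantile, so the two printed values should nearly coincide.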