In this work, we study the theoretical properties of conditional deep generative models under the statistical framework of distribution regression, where the response variable lies in a high-dimensional ambient space but concentrates around a potentially lower-dimensional manifold. Specifically, we analyze the large-sample properties of a likelihood-based approach for estimating these models. Our results yield the convergence rate of a sieve maximum likelihood estimator (MLE) for the conditional distribution (and its deconvolved counterpart) of the response given the predictors, in the Hellinger (respectively, Wasserstein) metric. The rate depends solely on the intrinsic dimension and smoothness of the true conditional distribution. These findings offer a statistical explanation of why conditional deep generative models can circumvent the curse of dimensionality, and show that they can learn a broad class of nearly singular conditional distributions. Our analysis also highlights the importance of introducing a small noise perturbation to the data when they are supported sufficiently close to a manifold. Finally, our numerical studies demonstrate an effective implementation of the proposed approach on both synthetic and real-world datasets, providing complementary validation of our theoretical findings.
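The noise-perturbation step mentioned above can be illustrated with a minimal sketch (not the paper's implementation): responses concentrated on a low-dimensional manifold have a nearly singular conditional distribution, so a small Gaussian perturbation is added before likelihood-based fitting. The manifold (a unit circle in 2-D), the predictor, and the perturbation scale `sigma` here are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic responses lying exactly on a 1-D manifold (the unit circle)
# embedded in 2-D ambient space, conditioned on a scalar predictor x.
n = 500
x = rng.uniform(0.0, 1.0, size=n)
theta = 2.0 * np.pi * x
y = np.stack([np.cos(theta), np.sin(theta)], axis=1)

# Small noise perturbation: smooths the singular conditional distribution
# so that a density (and hence a likelihood) exists in the ambient space.
sigma = 0.05  # assumed perturbation scale, a tuning parameter
y_perturbed = y + sigma * rng.normal(size=y.shape)

# The perturbed responses remain concentrated near the manifold.
dist_to_manifold = np.abs(np.linalg.norm(y_perturbed, axis=1) - 1.0)
print(float(np.mean(dist_to_manifold)))
```

A likelihood-based estimator such as a sieve MLE would then be fit to `(x, y_perturbed)`; the perturbation scale plays the role of the smoothing level in the theory, and the unperturbed (deconvolved) conditional distribution is recovered in the Wasserstein metric.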