Recent advances in generative AI have led to increasingly realistic synthetic data, yet evaluation criteria remain focused on marginal distribution matching. While these diagnostics assess local realism, they provide limited insight into whether a generative model preserves the multivariate dependence structures that govern downstream inference. We introduce covariance-level dependence fidelity as a practical criterion for evaluating whether a generative distribution preserves joint structure beyond univariate marginals. We establish three core results. First, distributions can match all univariate marginals exactly while exhibiting substantially different dependence structures, demonstrating that marginal fidelity alone is insufficient. Second, dependence divergence induces quantitative instability in downstream inference, including sign reversals in regression coefficients despite identical marginal behavior. Third, explicit control of covariance-level dependence divergence ensures stable behavior for dependence-sensitive tasks such as principal component analysis. Synthetic constructions illustrate how failures to preserve dependence lead to incorrect conclusions despite identical marginal distributions. These results highlight dependence fidelity as a useful diagnostic for evaluating generative models in dependence-sensitive downstream tasks, with implications for diffusion models and variational autoencoders. The guarantees apply specifically to procedures governed by covariance structure; tasks that depend on higher-order dependence, such as tail-event estimation, require richer criteria.
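The first two results can be illustrated with a small numerical sketch. The code below is not the construction used in the paper; it is a minimal example under stated assumptions. The "synthetic" sample reuses exactly the same values in each column as the "real" sample but pairs them in the opposite order, so every univariate marginal matches while the dependence is reversed. The Frobenius norm of the covariance difference is used here as a stand-in for covariance-level dependence divergence, which is an assumption, since the abstract does not fix a specific norm. All function names are hypothetical.

```python
import numpy as np


def make_real_and_synthetic(n=2000, rho=0.8, seed=0):
    """Return (real, synthetic) arrays of shape (n, 2) whose columns have
    identical empirical marginals but opposite dependence."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    y = rho * x + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
    real = np.column_stack([x, y])
    # Hypothetical "bad generator": same x values and same y values as the
    # real sample, but the smallest x is paired with the largest y.
    synthetic = np.column_stack([np.sort(x), np.sort(y)[::-1]])
    return real, synthetic


def ols_slope(data):
    """Slope of the least-squares regression of column 1 on column 0."""
    cov = np.cov(data, rowvar=False)
    return cov[0, 1] / cov[0, 0]


def covariance_divergence(a, b):
    """Frobenius norm of the difference between sample covariance matrices
    (an assumed choice of covariance-level divergence)."""
    diff = np.cov(a, rowvar=False) - np.cov(b, rowvar=False)
    return np.linalg.norm(diff, ord="fro")


def leading_pc(data):
    """Leading principal direction: eigenvector of the largest eigenvalue.
    np.linalg.eigh returns eigenvalues in ascending order."""
    _, eigvecs = np.linalg.eigh(np.cov(data, rowvar=False))
    return eigvecs[:, -1]


if __name__ == "__main__":
    real, synthetic = make_real_and_synthetic()
    print("slope (real):      ", ols_slope(real))
    print("slope (synthetic): ", ols_slope(synthetic))
    print("cov divergence:    ", covariance_divergence(real, synthetic))
    print("leading PC (real):      ", leading_pc(real))
    print("leading PC (synthetic): ", leading_pc(synthetic))
```

Any per-column diagnostic (histograms, quantiles, univariate tests) reports a perfect match for this pair, yet the regression slope changes sign and the leading principal direction rotates from the diagonal to the anti-diagonal. A covariance-level check detects the discrepancy directly.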