This paper explores the interplay between statistics and generative artificial intelligence. Generative statistics, an integral part of the latter, aims to construct models that can {\it generate} new data efficiently and meaningfully across the whole of the (usually high-dimensional) sample space, e.g. a new photo. Within it, the gradient-based approach is a current favourite: for this purpose, it effectively exploits the information contained in the observed sample, e.g. an old photo. However, the observed sample often contains missing data, e.g. missing bits in the old photo. To handle this situation, we propose a gradient-based algorithm for generative modelling. More importantly, our paper rigorously underpins this powerful approach by introducing a new F-entropy that is related to Fisher's divergence. (The F-entropy is also of independent interest.) This underpinning enables the gradient-based approach to expand its scope. For example, it can now provide a tool for [...] Possible future projects include discrete data and Bayesian variational inference.