In the field of artificial intelligence (AI), understanding and modeling data-generating processes (DGPs) is of paramount importance. Deep generative models (DGMs) have proven adept at capturing complex data distributions but often fall short in generalization and interpretability. Causality, on the other hand, offers a structured lens for understanding the mechanisms that drive data generation and highlights the cause-and-effect dynamics inherent in these processes. While causality excels in interpretability and the ability to extrapolate, it struggles with the intricacies of high-dimensional spaces. Recognizing the potential synergy between the two, we examine the confluence of causality and DGMs. We explain how causal principles are integrated into DGMs, investigate causal identification using DGMs, and explore an emerging research frontier: causality in large-scale generative models, particularly generative large language models (LLMs). We offer insights into methodologies, highlight open challenges, and suggest future directions, positioning this comprehensive review as a guide to a rapidly evolving area.