Causal DAGs (Directed Acyclic Graphs) are usually considered in a 2D plane. Edges indicate the directions of causal effects and imply the corresponding passage of time. Owing to the inherent restrictions of statistical models, effect estimation is usually approximated by averaging individuals' correlations, i.e., observational changes over a specific time span. However, in Machine Learning on large-scale problems with complex DAGs, the slight biases introduced by this approximation can snowball and distort global models. More importantly, they have in practice impeded the development of AI, for instance through the weak generalizability of causal models. In this paper, we redefine the causal DAG as the \emph{do-DAG}, in which variables' values are no longer time-stamp-dependent and timelines can be seen as axes. Through a geometric explanation of the multi-dimensional do-DAG, we identify the \emph{Causal Representation Bias} and its necessary factors, and differentiate it from common confounding biases. Accordingly, we propose a DL (Deep Learning)-based framework as a general solution, along with a realization method and experiments to verify its feasibility.