Causal DAGs (Directed Acyclic Graphs) are usually considered in a 2D plane. Edges indicate the directions of causal effects and imply the corresponding passage of time. Due to the natural restrictions of statistical models, effect estimation is usually approximated by averaging individuals' correlations, i.e., observational changes over a specific time span. However, in Machine Learning on large-scale problems with complex DAGs, the slight biases introduced by this approximation can snowball and distort global models. More importantly, this has practically impeded the development of AI, as seen, for instance, in the weak generalizability of causal models. In this paper, we redefine the causal DAG as the \emph{do-DAG}, in which variables' values are no longer time-stamp-dependent and timelines can be seen as axes. Through a geometric explanation of the multi-dimensional do-DAG, we identify the \emph{Causal Representation Bias} and its necessary factors, and differentiate it from common confounding biases. Accordingly, we propose a DL (Deep Learning)-based framework as a general solution, along with a realization method and experiments to verify its feasibility.