A deeper understanding of cause and effect in observational data is critical across many domains, such as economics, health care, public policy, web mining, online advertising, and marketing campaigns. Although significant advances have been made in overcoming the challenges of causal effect estimation with observational data, such as missing counterfactual outcomes and selection bias between treatment and control groups, existing methods mainly focus on source-specific and stationary observational data. Such learning strategies assume that all observational data are already available during the training phase and come from only one source. This practical concern about data accessibility is ubiquitous in academic and industrial applications. In short, in the era of big data, causal inference with observational data faces three new challenges: extensibility to incrementally available observational data, adaptability to additional domain adaptation problems beyond the imbalance between treatment and control groups, and accessibility of an enormous amount of data. In this position paper, we formally define the problem of continual treatment effect estimation, describe its research challenges, and present possible solutions. We also discuss future research directions on this topic.