A deeper understanding of cause and effect in observational data is critical across many domains, such as economics, health care, public policy, web mining, online advertising, and marketing campaigns. Although significant advances have been made in overcoming the challenges of causal effect estimation with observational data, such as missing counterfactual outcomes and selection bias between treatment and control groups, existing methods mainly focus on source-specific and stationary observational data. Such learning strategies assume that all observational data are already available during the training phase and come from only one source. This assumption raises a practical concern of accessibility that is ubiquitous in academic and industrial applications. In the era of big data, we therefore face new challenges in causal inference with observational data: extensibility to incrementally available observational data, adaptability to additional domain adaptation problems beyond the imbalance between treatment and control groups, and accessibility of enormous amounts of data. In this position paper, we formally define the problem of continual treatment effect estimation, describe its research challenges, and present possible solutions. We also discuss future research directions on this topic.