We propose a framework that aligns Conditional Average Treatment Effect (CATE) estimation with profit maximization. Our method recognizes that, for customers with extreme treatment effects, additional estimation accuracy is unlikely to change the recommended actions. In contrast, accuracy is critical near the decision boundary, where treatment effects are close to treatment costs. Our approach optimizes a novel objective function that concentrates learning capacity along this boundary. The proposed objective is Fisher consistent with respect to the original profit function and yields a consistent estimator for CATEs. Theoretically, our framework unifies standard plug-in optimization and direct policy optimization as limiting cases of the same optimization problem. We further show that entropy-regularized policy optimization is a special case of our framework. This result has a direct practical implication: firms can recover consistent CATE estimates from existing profit-maximization pipelines. We use synthetic data to demonstrate how the proposed framework allows firms to explicitly navigate the trade-off between global prediction accuracy and profit maximization.
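The motivating observation, that estimation error only matters near the decision boundary, can be illustrated with a small simulation. The sketch below is not the proposed objective; it only assumes a plug-in rule that treats a customer when the estimated CATE exceeds the treatment cost. The cost, the distribution of true effects, and the bounded error `max_err` are all hypothetical values chosen for illustration.

```python
import numpy as np

def plug_in_policy(tau_hat, cost):
    """Treat a customer when the estimated effect exceeds the treatment cost."""
    return tau_hat > cost

def expected_profit(policy, tau, cost):
    """Average profit per customer: each treated customer contributes tau - cost."""
    return float(np.mean(policy * (tau - cost)))

rng = np.random.default_rng(0)
n = 10_000
cost = 1.0                                     # hypothetical treatment cost
tau = rng.normal(loc=1.0, scale=2.0, size=n)   # hypothetical true CATEs
max_err = 0.5                                  # hypothetical bound on estimation error
tau_hat = tau + rng.uniform(-max_err, max_err, size=n)

oracle = plug_in_policy(tau, cost)       # decisions under the true effects
learned = plug_in_policy(tau_hat, cost)  # decisions under the noisy estimates
flipped = oracle != learned

# A decision can only flip when the true effect is within max_err of the cost.
near_boundary = np.abs(tau - cost) <= max_err
assert np.all(near_boundary[flipped])

regret = expected_profit(oracle, tau, cost) - expected_profit(learned, tau, cost)
print(f"flipped decisions: {flipped.mean():.1%}, profit regret: {regret:.4f}")
```

In this setup, customers whose true effect is far from the cost receive the same recommendation regardless of the estimation error, so all lost profit comes from the band around the boundary. This is the region where, according to the abstract, the proposed objective concentrates learning capacity.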