Despite the growing popularity of machine-learning techniques in decision-making, the added value of causally oriented strategies over pure machine-learning approaches has rarely been quantified in the literature. These strategies are crucial for practitioners in domains such as marketing, telecommunications, health care and finance. This paper presents a comprehensive treatment of the subject, starting from firm theoretical foundations and highlighting the parameters that influence the performance of the uplift and predictive approaches. Focusing on the case of a binary outcome and a binary action, it provides a theoretical analysis of uplift modeling and compares it with the classical predictive approach. The main research contributions are a new formulation of the measure of profit, a formal proof of the convergence of the uplift curve to the measure of profit, and an illustration, through simulations, of the conditions under which predictive approaches still outperform uplift modeling. We show that the mutual information between the features and the outcome plays a significant role, along with the variance of the estimators, the distribution of the potential outcomes and the underlying costs and benefits of the treatment and the outcome.