Many researchers have applied classical statistical decision theory to evaluate treatment choices and learn optimal policies. However, because this framework is based solely on the outcomes realized under the chosen decisions and ignores counterfactual outcomes, it cannot assess the quality of a decision relative to feasible alternatives. For example, in bail decisions, a judge must consider not only crime prevention but also the avoidance of unnecessary burdens on arrestees. To address this limitation, we generalize standard decision theory by incorporating counterfactual losses, allowing decisions to be evaluated using all potential outcomes. The central challenge in this counterfactual statistical decision framework is identification: since only one potential outcome is observed for each unit, the associated counterfactual risk is generally not identifiable. We prove that, under the assumption of strong ignorability, the counterfactual risk is identifiable if and only if the counterfactual loss function is additive in the potential outcomes. Moreover, we demonstrate that additive counterfactual losses can yield treatment recommendations that differ from those based on standard loss functions when the decision problem involves more than two treatment options. One interpretation of this result is that additive counterfactual losses can capture both the accuracy and the difficulty of a decision, whereas standard losses account for accuracy alone. Finally, we formulate a symbolic linear inverse program that, given a counterfactual loss, determines whether its risk is identifiable, without requiring data.
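The role of additivity in the identification result can be illustrated with a small numerical sketch. It is not the paper's construction: it assumes binary potential outcomes, two decisions, a fixed decision, and no covariates, and the function names and loss weights are hypothetical. The idea is that the data can at best pin down the marginal distribution of each potential outcome, not their joint distribution. Two joint distributions with identical marginals therefore give the same risk for a loss that is additive in the potential outcomes, but may give different risks for a loss with an interaction term.

```python
from itertools import product


def joint_independent(p0, p1):
    """Joint pmf of (Y(0), Y(1)) with independent binary margins P(Y(d)=1) = p_d."""
    return {
        (y0, y1): (p0 if y0 else 1 - p0) * (p1 if y1 else 1 - p1)
        for y0, y1 in product((0, 1), repeat=2)
    }


def joint_comonotone(p0, p1):
    """Joint pmf with the same margins but maximal positive dependence."""
    both = min(p0, p1)
    return {
        (1, 1): both,
        (1, 0): p0 - both,
        (0, 1): p1 - both,
        (0, 0): 1 - max(p0, p1),
    }


def risk(loss, joint):
    """Expected loss under a joint pmf of the potential outcomes."""
    return sum(prob * loss(y0, y1) for (y0, y1), prob in joint.items())


def additive_loss(y0, y1):
    # Hypothetical weights: a sum of one term per potential outcome.
    return 2.0 * y0 + 1.0 * (1 - y1)


def interactive_loss(y0, y1):
    # Hypothetical non-additive loss: depends on the pair (Y(0), Y(1)) jointly.
    return float(y0 == 1 and y1 == 0)


p0, p1 = 0.5, 0.5  # assumed marginal probabilities
indep = joint_independent(p0, p1)
comon = joint_comonotone(p0, p1)

print("additive:   ", risk(additive_loss, indep), risk(additive_loss, comon))
print("interactive:", risk(interactive_loss, indep), risk(interactive_loss, comon))
```

Under these assumptions the additive loss yields the same risk for both couplings, while the interactive loss does not, so its risk cannot be determined from the marginals alone.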