A Unified Framework for Debiased Machine Learning: Riesz Representer Fitting under Bregman Divergence

Estimating the Riesz representer is central to debiased machine learning for causal and structural parameter estimation. We propose generalized Riesz regression, a unified framework for estimating the Riesz representer by fitting a representer model via Bregman divergence minimization. This framework includes various divergences as special cases, such as the squared distance and the Kullback-Leibler (KL) divergence, where the former recovers Riesz regression and the latter recovers tailored loss minimization. Under suitable pairs of divergence and model specifications (link functions), the dual problems of the Riesz representer fitting problem correspond to covariate balancing, which we call automatic covariate balancing. Moreover, under the same specifications, the sample average of outcomes weighted by the estimated Riesz representer satisfies Neyman orthogonality even without estimating the regression function, a property we call automatic Neyman orthogonalization. This property not only reduces the estimation error of Neyman orthogonal scores but also clarifies a key distinction between debiased machine learning and targeted maximum likelihood estimation (TMLE). Our framework can also be viewed as a generalization of density ratio fitting under Bregman divergences to Riesz representer estimation, and it applies beyond density ratio estimation. We provide convergence analyses for both reproducing kernel Hilbert space (RKHS) and neural network model classes. A Python package for generalized Riesz regression is released as genriesz and is available at https://github.com/MasaKat0/genriesz.
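
As a rough illustration of the squared-distance special case, the following minimal sketch fits a linear-in-features Riesz representer for the average treatment effect (ATE) and checks the balancing identity numerically. It does not use the genriesz package or its API. The ATE target, the simulated data, the feature map `features`, and the helper `fit_riesz_squared_loss` are all assumptions made for illustration. With a linear model and no ridge penalty, the first-order condition of the squared loss makes the representer-weighted feature means equal the target feature means, which is the kind of covariate balancing the abstract describes.

```python
import numpy as np


def features(d, x):
    """Hypothetical basis phi(d, x): intercept, treatment, covariates, interactions."""
    d = np.asarray(d, dtype=float).reshape(-1, 1)
    ones = np.ones_like(d)
    return np.hstack([ones, d, x, d * x])


def fit_riesz_squared_loss(d, x, ridge=0.0):
    """Minimize the empirical squared-loss objective
        mean(alpha(D, X)^2) - 2 * mean(alpha(1, X) - alpha(0, X))
    over alpha(d, x) = phi(d, x) @ beta (an illustrative linear model)."""
    phi = features(d, x)
    n, p = phi.shape
    # ATE functional applied to each basis function: phi(1, X) - phi(0, X).
    m_phi = features(np.ones(n), x) - features(np.zeros(n), x)
    target = m_phi.mean(axis=0)
    gram = phi.T @ phi / n + ridge * np.eye(p)
    beta = np.linalg.solve(gram, target)  # closed-form minimizer
    return phi @ beta, phi, target


# Simulated data (assumed design): true ATE is 2, outcome is linear in the basis.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=(n, 2))
propensity = 1.0 / (1.0 + np.exp(-0.5 * x[:, 0]))
d = rng.binomial(1, propensity)
y = 2.0 * d + x[:, 0] + rng.normal(size=n)

alpha_hat, phi, target = fit_riesz_squared_loss(d, x)

# Balancing check: weighted feature means should match the target feature means.
balance_gap = np.abs(phi.T @ alpha_hat / n - target).max()

# Representer-weighted outcome average, computed without any outcome regression.
ate_weighted = np.mean(alpha_hat * y)

print("max balance gap:", balance_gap)
print("weighted ATE estimate:", ate_weighted)
```

In this sketch the outcome is linear in the same features, so the weighted average recovers the ATE up to noise without fitting a regression function. With a nonzero `ridge` the balancing identity holds only approximately.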
