Many parameters of interest depend on high-dimensional regressions. Machine learning can be used to estimate such parameters. However, estimators based on machine learners can be severely biased by regularization and/or model selection. Debiased machine learning uses Neyman orthogonal estimating equations to reduce these biases, which generally requires estimating unknown Riesz representers. A primary innovation of this paper is to provide Riesz regression estimators of Riesz representers that are constructed from the parameter of interest itself, rather than from an explicit formula for the representer, and that can employ any machine learner, including neural nets and random forests. End-to-end algorithms emerge in which the researcher chooses the parameter of interest and the machine learner, and the debiasing follows automatically. Another innovation is debiased machine learners of parameters that depend on generalized regressions, including high-dimensional generalized linear models. An empirical example of automatic debiased machine learning using neural nets is given. In Monte Carlo examples, automatic debiasing sometimes performs better than debiasing via inverse propensity scores and never performs worse. Finite sample mean square error bounds for Riesz regression estimators and asymptotic theory are also given.
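To make the idea of Riesz regression concrete, the following is a minimal sketch and not the paper's implementation. It assumes the parameter of interest is an average treatment effect, so that the functional is m(W, g) = g(1, Z) - g(0, Z), and it estimates the Riesz representer by minimizing the sample average of alpha(X)^2 - 2 m(W, alpha) over functions that are linear in a small, hypothetical dictionary `basis(d, z)`. Ridge-penalized linear fits stand in for the neural nets and random forests mentioned above, cross-fitting is omitted for brevity, and all function names, penalty values, and the simulated data are illustrative assumptions.

```python
import numpy as np


def basis(d, z):
    # Hypothetical dictionary b(d, z): intercept, treatment, covariates, interactions.
    d = d.reshape(-1, 1)
    return np.hstack([np.ones_like(d), d, z, d * z])


def riesz_regression_ate(d, z, ridge=1e-3):
    """Minimize mean[alpha(X)^2 - 2 m(W, alpha)] + ridge * |rho|^2 for alpha = b(X)'rho."""
    b = basis(d, z)
    # m(W, b) for the assumed ATE functional: b(1, Z) - b(0, Z).
    m_b = basis(np.ones_like(d), z) - basis(np.zeros_like(d), z)
    G = b.T @ b / len(d)   # sample second-moment matrix of the dictionary
    M = m_b.mean(axis=0)   # sample mean of m(W, b)
    # First-order condition of the quadratic objective: (G + ridge * I) rho = M.
    return np.linalg.solve(G + ridge * np.eye(G.shape[1]), M)


def debiased_ate(y, d, z, ridge_g=1.0, ridge_alpha=1e-3):
    """Return (plug-in, debiased) ATE estimates; no cross-fitting in this sketch."""
    b = basis(d, z)
    n, p = b.shape
    # Deliberately over-regularized outcome regression g(X) = b(X)'beta.
    beta = np.linalg.solve(b.T @ b / n + ridge_g * np.eye(p), b.T @ y / n)
    g1 = basis(np.ones_like(d), z) @ beta
    g0 = basis(np.zeros_like(d), z) @ beta
    alpha = b @ riesz_regression_ate(d, z, ridge_alpha)
    plug_in = np.mean(g1 - g0)
    # Orthogonal correction: average of alpha(X) times the regression residual.
    correction = np.mean(alpha * (y - b @ beta))
    return plug_in, plug_in + correction


# Simulated data (illustrative only) with a true average treatment effect of 2.
rng = np.random.default_rng(0)
n = 5000
z = rng.normal(size=(n, 3))
propensity = 1.0 / (1.0 + np.exp(-z[:, 0]))
d = (rng.uniform(size=n) < propensity).astype(float)
y = 2.0 * d + z @ np.array([1.0, -1.0, 0.5]) + rng.normal(size=n)

plug_in, debiased = debiased_ate(y, d, z)
print(f"plug-in: {plug_in:.3f}, debiased: {debiased:.3f}")
```

Note that `riesz_regression_ate` never uses a propensity score formula: only the functional m enters, through `m_b`. Under the stated assumptions, swapping in a different parameter of interest would amount to changing how `m_b` is computed.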