Moment restrictions and their conditional counterparts arise in many areas of machine learning and statistics, ranging from causal inference to reinforcement learning. Estimators for these tasks, generally called methods of moments, include the prominent generalized method of moments (GMM), which has recently gained attention in causal inference. GMM is a special case of the broader family of empirical likelihood estimators, which approximate a population distribution by minimizing a $\varphi$-divergence to an empirical distribution. However, the use of $\varphi$-divergences effectively limits the candidate distributions to reweightings of the data samples. We lift this long-standing limitation and provide a method of moments that goes beyond data reweighting. This is achieved by defining an empirical likelihood estimator based on maximum mean discrepancy, which we term the kernel method of moments (KMM). We provide a variant of our estimator for conditional moment restrictions and show that it is asymptotically first-order optimal for such problems. Finally, we show that our method achieves competitive performance on several conditional moment restriction tasks.
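To make the contrast between data reweighting and more general candidate distributions concrete, the following is a minimal sketch rather than the KMM estimator itself. It computes the squared maximum mean discrepancy between weighted point sets under an assumed Gaussian kernel. The function names, the bandwidth, and the toy data are illustrative assumptions. The point it illustrates is that the discrepancy stays finite for a candidate whose support lies off the observed samples, whereas a $\varphi$-divergence such as the KL divergence to the empirical distribution would be infinite for that candidate.

```python
import numpy as np


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of x and the rows of y."""
    sq_dists = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd_squared(x, w_x, y, w_y, bandwidth=1.0):
    """Squared MMD between sum_i w_x[i] delta_{x_i} and sum_j w_y[j] delta_{y_j}."""
    k_xx = gaussian_kernel(x, x, bandwidth)
    k_yy = gaussian_kernel(y, y, bandwidth)
    k_xy = gaussian_kernel(x, y, bandwidth)
    return float(w_x @ k_xx @ w_x + w_y @ k_yy @ w_y - 2.0 * w_x @ k_xy @ w_y)


rng = np.random.default_rng(0)
n = 50
data = rng.normal(size=(n, 1))      # toy sample (assumption, not from the paper)
uniform = np.full(n, 1.0 / n)       # weights of the empirical distribution

# Candidate 1: a reweighting of the observed samples.
weights = rng.dirichlet(np.ones(n))
mmd_reweighted = mmd_squared(data, weights, data, uniform)

# Candidate 2: uniform weights on points moved off the observed samples.
shifted = data + 0.1
mmd_shifted = mmd_squared(shifted, uniform, data, uniform)

print(f"MMD^2, reweighted candidate: {mmd_reweighted:.6f}")
print(f"MMD^2, shifted candidate:    {mmd_shifted:.6f}")
```

Both candidates receive a finite discrepancy, so an optimizer working with this distance can in principle move mass to new locations instead of only adjusting the weights on the observed points.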