Instrumental variable (IV) regression can be formulated as a problem of conditional moment restrictions (CMR). Most CMR estimators build on variants of the generalized method of moments and implicitly approximate the population data distribution by reweighting the empirical sample. In the independent and identically distributed (IID) setting with large sample sizes, such reweightings can provide sufficient flexibility, but they may fail to capture the relevant information when the data are corrupted or subject to adversarial attacks. To address these shortcomings, we propose the Sinkhorn Method of Moments, an optimal transport-based IV estimator that takes into account the geometry of the data manifold through data-derivative information. We provide a simple plug-and-play implementation of our method that performs on par with related estimators in standard settings while improving robustness against data corruption and adversarial attacks.
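As background for the moment-restriction view of IV regression, the following is a minimal sketch of the standard linear, single-instrument case. It is not the Sinkhorn Method of Moments, and the data-generating process and all function names (`simulate_iv_data`, `ols_estimate`, `iv_estimate`, `reweighted_moment`) are hypothetical choices made for illustration. It shows that the IV estimate solves the sample moment condition, that ordinary least squares is biased under confounding, and how a moment can be evaluated under an arbitrary reweighting of the sample.

```python
import numpy as np


def simulate_iv_data(n, theta=2.0, seed=0):
    """Toy confounded data (an assumed setup, not taken from the paper)."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                        # instrument
    u = rng.normal(size=n)                        # unobserved confounder
    x = z + u + 0.1 * rng.normal(size=n)          # treatment depends on z and u
    y = theta * x + u + 0.1 * rng.normal(size=n)  # outcome depends on x and u
    return x, y, z


def ols_estimate(x, y):
    """Least-squares slope without intercept; biased because x correlates with u."""
    return float(x @ y / (x @ x))


def iv_estimate(x, y, z):
    """Solve the sample moment condition sum_i z_i * (y_i - theta * x_i) = 0."""
    return float(z @ y / (z @ x))


def reweighted_moment(theta, x, y, z, weights):
    """Moment sum_i w_i * z_i * (y_i - theta * x_i) under sample weights w."""
    return float(np.sum(weights * z * (y - theta * x)))


if __name__ == "__main__":
    x, y, z = simulate_iv_data(20000)
    theta_iv = iv_estimate(x, y, z)
    uniform = np.full(x.shape[0], 1.0 / x.shape[0])  # empirical distribution
    print("OLS estimate:", ols_estimate(x, y))
    print("IV estimate:", theta_iv)
    print("moment at IV estimate:", reweighted_moment(theta_iv, x, y, z, uniform))
```

With uniform weights, `reweighted_moment` is the usual empirical moment. Replacing `uniform` with other nonnegative weights that sum to one gives the kind of reweighted empirical distribution that the abstract attributes to most CMR estimators. Such weights cannot move the sample points themselves, which is the limitation an optimal transport-based approach is meant to address.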