We consider a fairly general class of structural equation models (SEMs) relating a target to its covariates across several shifted environments. Given $k\in\N$ shifts, we consider the set of shifts that are at most $\gamma$ times as strong as a given weighted linear combination of these $k$ shifts, together with the worst (quadratic) risk over this entire set. This worst risk admits a decomposition, which we refer to as the "worst risk decomposition". We then find an explicit minimizer of the worst risk and consider the corresponding plug-in estimator, which is the main object of this paper. This plug-in estimator is (almost surely) consistent, and we first prove a concentration of measure result for it. The worst-risk minimizer is reminiscent of the ordinary least squares solution in that it is the product of a vector and the inverse of a Gramian matrix. Because of this, the central moments of the plug-in estimator are not well-defined in general; we instead consider these moments conditioned on the Gramian inverse being bounded by a given constant. We also study the conditional variance of the estimator with respect to a natural filtration for the incoming data. Similarly, we consider the conditional covariance matrix with respect to this filtration and prove a bound on its determinant. This class of SEMs generalizes the linear models studied previously, for instance in causal inference or anchor regression, but the concentration of measure result and the moment bounds are new even in the linear setting.