We introduce the Conditional Independence Regression CovariancE (CIRCE), a measure of conditional independence for multivariate continuous-valued variables. CIRCE serves as a regularizer in settings where we wish to learn neural features $\varphi(X)$ of data $X$ to estimate a target $Y$, while keeping these features conditionally independent of a distractor $Z$ given $Y$. Both $Z$ and $Y$ are assumed to be continuous-valued but relatively low dimensional, whereas $X$ and its features may be complex and high dimensional. Relevant settings include domain-invariant learning, fairness, and causal learning. The procedure requires just a single ridge regression from $Y$ to kernelized features of $Z$, which can be done in advance. It is then only necessary to enforce independence of $\varphi(X)$ from the residuals of this regression, which is possible with attractive estimation properties and consistency guarantees. By contrast, earlier measures of conditional feature dependence require multiple regressions at each step of feature learning, resulting in more severe bias and variance, and greater computational cost. When sufficiently rich features are used, we establish that CIRCE is zero if and only if $\varphi(X) \perp \!\!\! \perp Z \mid Y$. In experiments, we show superior performance to previous methods on challenging benchmarks, including learning conditionally invariant image features.
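The two-stage procedure described above (one ridge regression from $Y$ to features of $Z$ fitted in advance, then a penalty on the dependence between $\varphi(X)$ and the regression residuals) can be illustrated with a minimal NumPy sketch. This is a simplified stand-in, not the CIRCE estimator itself: it assumes random Fourier features in place of exact kernels, uses the raw inputs as hypothetical "features" $\varphi(X)$, and measures dependence by a plain squared cross-covariance. All function names, the synthetic data, and the hyperparameters are assumptions made for illustration.

```python
import numpy as np


def make_rff(dim_in, n_features, lengthscale, rng):
    """Return a fixed random Fourier feature map approximating a Gaussian kernel."""
    w = rng.normal(scale=1.0 / lengthscale, size=(dim_in, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)

    def feature_map(v):
        return np.sqrt(2.0 / n_features) * np.cos(v @ w + b)

    return feature_map


def fit_ridge(feats_y, feats_z, lam):
    """Ridge regression from features of Y to features of Z (fitted once, in advance)."""
    n, d = feats_y.shape
    gram = feats_y.T @ feats_y + lam * n * np.eye(d)
    return np.linalg.solve(gram, feats_y.T @ feats_z)


def residual_cross_cov_penalty(phi_x, resid):
    """Squared Frobenius norm of the empirical cross-covariance of phi_x and resid."""
    phi_c = phi_x - phi_x.mean(axis=0)
    res_c = resid - resid.mean(axis=0)
    cov = phi_c.T @ res_c / len(phi_x)
    return float(np.sum(cov ** 2))


# Synthetic data (illustrative assumption): Z depends on Y plus independent noise.
rng = np.random.default_rng(0)
n = 2000
y = rng.normal(size=(n, 1))
z = y + rng.normal(size=(n, 1))
x_good = np.sin(y) + 0.5 * rng.normal(size=(n, 1))  # independent of Z given Y
x_bad = z + 0.1 * rng.normal(size=(n, 1))           # leaks Z even after conditioning on Y

feat_y = make_rff(1, 100, 1.0, rng)
feat_z = make_rff(1, 50, 1.0, rng)

# Stage 1: fit the regression on one half of the data.
half = n // 2
weights = fit_ridge(feat_y(y[:half]), feat_z(z[:half]), lam=1e-3)

# Stage 2: residuals of the Z features on the other half, then the penalty.
resid = feat_z(z[half:]) - feat_y(y[half:]) @ weights
penalty_good = residual_cross_cov_penalty(x_good[half:], resid)
penalty_bad = residual_cross_cov_penalty(x_bad[half:], resid)
print(penalty_good, penalty_bad)
```

In a training loop, a penalty of this kind would be added to the prediction loss and recomputed on each batch of learned features, while the regression weights stay fixed. In this toy setup the value for `x_bad` should come out noticeably larger than for `x_good`, since only `x_bad` carries information about $Z$ beyond what $Y$ explains.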