We study the statistical design of a fair mechanism that attains equalized odds, where an agent uses useful data (a database) $X$ to solve a task $T$. Since both $X$ and $T$ are correlated with a latent sensitive attribute $S$, the agent designs a representation $Y$ that satisfies equalized odds, that is, $I(Y;S|T) = 0$. In contrast to our previous work, we assume here that the agent has no direct access to $S$ and $T$; hence, the Markov chains $S - X - Y$ and $T - X - Y$ hold. Furthermore, we impose a geometric structure on the conditional distribution $P_{S|Y}$, allowing $Y$ and $S$ to have a small correlation, bounded by a threshold. When the threshold is small, concepts from information geometry allow us to approximate the mutual information and reformulate the fair mechanism design problem as a quadratic program that has closed-form solutions under certain constraints. For the other cases, we derive simple, low-complexity lower bounds based on the largest singular value of a matrix and its corresponding singular vector. Finally, we compare our designs with the optimal solution in a numerical example.
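To make the equalized-odds condition $I(Y;S|T) = 0$ concrete, the following is a minimal sketch of how that quantity could be evaluated numerically for a candidate mechanism $P_{Y|X}$. It is not the design procedure of the paper. It assumes finite alphabets and the joint Markov chain $(S,T) - X - Y$, which is slightly stronger than the two separate chains stated above. The function name, array layout, and toy distributions are all hypothetical.

```python
import numpy as np


def conditional_mutual_information(p_stx, p_y_given_x):
    """Return I(Y;S|T) in nats.

    p_stx[s, t, x]    : joint pmf of (S, T, X)  (illustrative layout)
    p_y_given_x[x, y] : mechanism P(Y = y | X = x), rows sum to 1
    Assumes the chain (S, T) - X - Y, so Y depends on (S, T) only through X.
    """
    # Joint pmf of (S, T, Y), obtained by marginalising out X.
    p_sty = np.einsum("stx,xy->sty", p_stx, p_y_given_x)

    p_t = p_sty.sum(axis=(0, 2))   # P(t)
    p_st = p_sty.sum(axis=2)       # P(s, t)
    p_ty = p_sty.sum(axis=0)       # P(t, y)

    # I(Y;S|T) = sum_{s,t,y} P(s,t,y) log[ P(s,t,y) P(t) / (P(s,t) P(t,y)) ]
    num = p_sty * p_t[None, :, None]
    den = p_st[:, :, None] * p_ty[None, :, :]
    mask = p_sty > 0               # convention 0 log 0 = 0
    return float(np.sum(p_sty[mask] * np.log(num[mask] / den[mask])))


# Toy joint pmf over |S| = 2, |T| = 2, |X| = 3 (random, for illustration only).
rng = np.random.default_rng(0)
p_stx = rng.random((2, 2, 3))
p_stx /= p_stx.sum()

# A mechanism that ignores X releases nothing about S: I(Y;S|T) = 0.
constant = np.full((3, 2), 0.5)
# Releasing X itself (Y = X) generally violates equalized odds.
identity = np.eye(3)

leak_constant = conditional_mutual_information(p_stx, constant)
leak_identity = conditional_mutual_information(p_stx, identity)
print(leak_constant, leak_identity)
```

The two mechanisms bracket the design space: the constant mechanism satisfies the constraint exactly but is useless for the task, while the identity mechanism keeps all of $X$ and typically leaks information about $S$ given $T$. A fair mechanism in the sense of the abstract sits between these extremes.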