Inferring community structure in probabilistic graphical models may violate fairness constraints when nodes carry demographic attributes: certain demographic groups may be over-represented in some detected communities and under-represented in others. This paper proposes a novel $\ell_1$-regularized pseudo-likelihood approach for fair graphical model selection. In particular, we assume that the true underlying graph has some community or clustering structure, and we seek to learn from the data a sparse undirected graph and its communities such that demographic groups are fairly represented within the communities. For the case in which the graph is known a priori, we provide a convex semidefinite programming approach for fair community detection. We establish the statistical consistency of the proposed method for both a Gaussian graphical model (continuous data) and an Ising model (binary data), proving that our method recovers the graphs and their fair communities with high probability.
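To make the notion of over- and under-representation concrete, the following is a minimal sketch of a diagnostic that compares each demographic group's share inside a community with its share in the whole node set. It is an illustration only and is not the fairness criterion or the estimator defined in the paper; the function names `community_group_shares` and `max_share_gap` and the toy labels are assumptions introduced here.

```python
from collections import Counter, defaultdict


def community_group_shares(communities, groups):
    """Return {community: {group: fraction of that community's nodes}}.

    `communities[i]` and `groups[i]` are the (hypothetical) community label
    and demographic label of node i.
    """
    counts = defaultdict(Counter)
    for community, group in zip(communities, groups):
        counts[community][group] += 1
    return {
        community: {g: n / sum(cnt.values()) for g, n in cnt.items()}
        for community, cnt in counts.items()
    }


def max_share_gap(communities, groups):
    """Largest absolute gap between a group's share within any community
    and that group's share among all nodes (0.0 means perfectly proportional)."""
    overall = Counter(groups)
    total = len(groups)
    shares = community_group_shares(communities, groups)
    gap = 0.0
    for community_shares in shares.values():
        for group, count in overall.items():
            within = community_shares.get(group, 0.0)
            gap = max(gap, abs(within - count / total))
    return gap


# Toy example (assumed labels): two communities, two demographic groups.
balanced = max_share_gap([0, 0, 1, 1], ["a", "b", "a", "b"])
segregated = max_share_gap([0, 0, 1, 1], ["a", "a", "b", "b"])
```

In the toy example, the first partition mirrors the overall group proportions in every community, while the second places each group entirely in its own community, which is the kind of imbalance the abstract describes.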