Inferring community structure in probabilistic graphical models may violate fairness constraints when nodes carry demographic attributes: some demographic groups may be over-represented in certain detected communities and under-represented in others. This paper proposes a novel $\ell_1$-regularized pseudo-likelihood approach for fair graphical model selection. Specifically, we assume that the true underlying graph has some community (cluster) structure, and we seek to learn a sparse undirected graph and its communities from the data such that demographic groups are fairly represented within the communities. When the graph is known a priori, we provide a convex semidefinite programming approach for fair community detection. We establish the statistical consistency of the proposed method for both a Gaussian graphical model (continuous data) and an Ising model (binary data), proving that our method can recover the graphs and their fair communities with high probability.
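To make the notion of over- and under-representation concrete, the following is a minimal sketch that compares each demographic group's share inside a community with its share in the whole graph. The abstract does not state the paper's exact fairness criterion, so this balance measure, along with the function names `community_balance` and `max_imbalance`, is an illustrative assumption rather than the authors' definition.

```python
from collections import Counter


def community_balance(communities, groups):
    """Hypothetical balance measure (an assumption, not the paper's definition).

    communities[i] is the community label of node i and groups[i] is its
    demographic group. For every community, return the share of each group
    inside that community minus the group's share over all nodes.
    A positive gap means over-representation, a negative gap under-representation.
    """
    if len(communities) != len(groups):
        raise ValueError("communities and groups must have the same length")
    if not groups:
        raise ValueError("at least one node is required")

    n = len(groups)
    # Share of each demographic group over the whole node set.
    overall = {g: count / n for g, count in Counter(groups).items()}

    # Collect the demographic groups of the nodes in each community.
    members = {}
    for c, g in zip(communities, groups):
        members.setdefault(c, []).append(g)

    gaps = {}
    for c, member_groups in members.items():
        counts = Counter(member_groups)
        size = len(member_groups)
        gaps[c] = {g: counts.get(g, 0) / size - overall[g] for g in overall}
    return gaps


def max_imbalance(communities, groups):
    """Largest absolute gap over all communities and groups (0.0 means balanced)."""
    gaps = community_balance(communities, groups)
    return max(abs(v) for per_group in gaps.values() for v in per_group.values())


if __name__ == "__main__":
    # Toy data: six nodes, two communities, two demographic groups.
    communities = [0, 0, 0, 1, 1, 1]
    groups = ["a", "a", "b", "b", "b", "a"]
    print(community_balance(communities, groups))
    print(max_imbalance(communities, groups))
```

Under this assumed measure, a partition in which every community mirrors the overall demographic proportions has a maximum imbalance of zero; a fair community detection method would favor partitions that keep this quantity small.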