Toward Individual Fairness Without Centralized Data: Selective Counterfactual Consistency for Vertical Federated Learning

When algorithmic decisions depend on data distributed across institutions, how can we ensure that an individual's outcome does not change arbitrarily based on a protected attribute? We study this question in vertical federated learning (VFL), where features are split across parties, sensitive attributes may be private, and proxies for protected characteristics can be scattered across institutional boundaries under strict privacy constraints. Our focus is on individual-level counterfactual stability, i.e., per-instance prediction consistency under protected-attribute interventions as formalized in the causal fairness literature, rather than on group parity guarantees such as demographic parity or equalized odds. We propose SCC-VFL, a server-centric framework for enforcing selective counterfactual consistency (SCC) at the individual level in VFL. SCC-VFL operationalizes a given policy specification by combining three components: (i) differentially private, graph-free discovery of feature roles, which sorts features into non-descendants, policy-permitted mediators, and impermissible proxies using only a formally private sketch of the sensitive attribute, with a formal per-release privacy guarantee that does not extend to the full training pipeline; (ii) masked counterfactual generation that edits only mediators while fixing non-descendants and suppressing proxy leakage; and (iii) server-side enforcement via an SCC consistency loss that penalizes impermissible prediction changes under protected-attribute interventions. Across three real-world datasets spanning credit, healthcare, and criminal justice, SCC-VFL maintains or improves predictive accuracy while reducing decision flip rates by up to 98% relative to strong baselines. It also lowers attribute-inference attack success and improves robustness, demonstrating favorable utility-fairness-privacy trade-offs in realistic VFL deployments.
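To make components (ii) and (iii) and the flip-rate metric concrete, the following is a minimal NumPy sketch under stated assumptions, not the SCC-VFL implementation. The function names, the boolean `mediator_mask`, the squared-gap form of the penalty, and the 0.5 decision threshold are all hypothetical choices made for illustration; the abstract does not specify the exact form of the loss or how counterfactual values are proposed.

```python
import numpy as np

def masked_counterfactual(x, x_cf_proposal, mediator_mask):
    """Edit only mediator columns; every other column keeps its factual value.

    Assumption: `mediator_mask` is a boolean vector over features, True where
    the policy permits the feature to change under the intervention.
    """
    return np.where(mediator_mask, x_cf_proposal, x)

def consistency_penalty(p_factual, p_counterfactual):
    """Mean squared gap between factual and counterfactual predicted probabilities.

    Assumption: a squared gap stands in for the (unspecified) SCC loss.
    """
    return float(np.mean((p_factual - p_counterfactual) ** 2))

def decision_flip_rate(p_factual, p_counterfactual, threshold=0.5):
    """Fraction of individuals whose thresholded decision differs between the two."""
    flips = (p_factual >= threshold) != (p_counterfactual >= threshold)
    return float(np.mean(flips))

# Toy usage with made-up numbers (not results from the paper).
x = np.array([[1.0, 2.0, 3.0]])
proposal = np.array([[9.0, 9.0, 9.0]])
mask = np.array([False, True, False])      # only the middle feature is a mediator
x_cf = masked_counterfactual(x, proposal, mask)

p_f = np.array([0.2, 0.7, 0.6, 0.4])       # factual predicted probabilities
p_cf = np.array([0.3, 0.4, 0.65, 0.4])     # counterfactual predicted probabilities
penalty = consistency_penalty(p_f, p_cf)
flip_rate = decision_flip_rate(p_f, p_cf)
```

In this toy example only the second individual crosses the threshold, so one of four decisions flips. A penalty of this kind would be added to the task loss on the server side, with a weight controlling the utility-fairness trade-off; that weighting is likewise an assumption here.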
