Towards Fair Disentangled Online Learning for Changing Environments

In online learning for changing environments, data arrive sequentially over time, and the underlying distribution may change frequently. Although existing methods demonstrate the effectiveness of their learning algorithms by providing tight bounds on either dynamic regret or adaptive regret, most of them ignore model fairness, defined here as statistical parity across different subpopulations (e.g., by race or gender). Another drawback is that, when adapting to a new environment, an online learner must update its model parameters globally, which is costly and inefficient. Inspired by the sparse mechanism shift hypothesis, we claim that changing environments in online learning can be attributed to partial changes in the learned parameters: only the environment-specific parameters change, while the rest remain invariant across environments. To this end, we propose a novel algorithm under the assumption that the data collected at each time step can be disentangled into two representations: an environment-invariant semantic factor and an environment-specific variation factor. The semantic factor is then used for fair prediction under a group fairness constraint. To evaluate the sequence of model parameters generated by the learner, we propose a novel regret that takes a mixed form of dynamic and static regret metrics, accompanied by a fairness-aware long-term constraint. Our analysis provides theoretical guarantees on the loss regret and on the cumulative violation of the fairness constraints. Empirical evaluations on real-world datasets demonstrate that the proposed method consistently outperforms baseline methods in model accuracy and fairness.
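
To make the fairness notion in the abstract concrete, the following is a minimal sketch of how a statistical parity gap could be measured per round and accumulated over time. It is an illustration only, not the paper's algorithm: the function names, the binary group encoding (0/1), the tolerance `epsilon`, and the clipped-sum form of the cumulative violation are all assumptions, since the abstract does not give the exact definitions.

```python
import numpy as np


def statistical_parity_gap(y_pred, group):
    """Absolute difference in positive-prediction rates between two groups.

    Assumes binary predictions (0/1) and a binary group label (0/1),
    with at least one sample from each group.
    """
    y_pred = np.asarray(y_pred, dtype=float)
    group = np.asarray(group)
    rate_0 = y_pred[group == 0].mean()
    rate_1 = y_pred[group == 1].mean()
    return float(abs(rate_0 - rate_1))


def cumulative_violation(gaps, epsilon=0.05):
    """Sum, over rounds, of how far each gap exceeds the tolerance epsilon.

    The clipped sum is one common convention for long-term constraints;
    it is an assumption here, not the paper's stated definition.
    """
    return float(sum(max(g - epsilon, 0.0) for g in gaps))


# Two toy rounds of an online stream (hypothetical data).
gap_round1 = statistical_parity_gap([1, 0, 1, 1], [0, 0, 1, 1])  # rates 0.5 vs 1.0
gap_round2 = statistical_parity_gap([1, 0, 1, 0], [0, 0, 1, 1])  # rates 0.5 vs 0.5
total_violation = cumulative_violation([gap_round1, gap_round2])
```

In this toy stream only the first round contributes to the violation, because the second round's gap is zero and falls within the tolerance.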
