Distribution alignment can be used to learn invariant representations, with applications in fairness and robustness. Most prior work relies on adversarial alignment, but the resulting minimax problems are unstable and difficult to optimize. Non-adversarial likelihood-based approaches either require model invertibility, impose constraints on the latent prior, or lack a generic framework for alignment. To overcome these limitations, we propose a non-adversarial VAE-based alignment method that can be applied to any model pipeline. We develop a set of alignment upper bounds (including a noisy bound) whose objectives resemble those of VAEs but arise from a different perspective. We compare our method to prior VAE-based alignment approaches both theoretically and empirically. Finally, we demonstrate that our novel alignment losses can replace adversarial losses in standard invariant representation learning pipelines without modifying the original architectures, thereby significantly broadening the applicability of non-adversarial alignment methods.