Voice biometric systems can exhibit sex-related performance gaps even when overall verification accuracy is strong. We attribute these gaps to two practical mechanisms: (i) demographic shortcut learning, where speaker classification training exploits spurious correlations between sex and speaker identity, and (ii) feature entanglement, where sex-linked acoustic variation overlaps with identity cues and cannot be removed without degrading speaker discrimination. We propose Fair-Gate, a fairness-aware and interpretable risk-gating framework that addresses both mechanisms in a single pipeline. Fair-Gate applies risk extrapolation to reduce variation in speaker-classification risk across proxy sex groups, and introduces a local complementary gate that routes intermediate features into an identity branch and a sex branch. The gate is interpretable because it produces an explicit routing mask that can be inspected to see which features are allocated to the identity pathway and which to the sex-related pathway. Experiments on VoxCeleb1 show that Fair-Gate improves the utility-fairness trade-off, yielding more sex-fair automatic speaker verification (ASV) performance under challenging evaluation conditions.
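The two components named in the abstract can be illustrated with a minimal sketch. It rests on assumptions the abstract does not confirm: the risk-extrapolation term is taken to be a variance penalty over per-group risks (the V-REx form), and the complementary gate is taken to be an elementwise sigmoid mask m for the identity branch with 1 - m for the sex branch. The function names, the `beta` weight, and the sample numbers are hypothetical and chosen only for illustration.

```python
import math
from statistics import mean, pvariance


def vrex_objective(group_risks, beta=1.0):
    """Mean risk plus beta times the variance of per-group risks.

    Assumption: a V-REx-style penalty; the abstract only says that risk
    extrapolation reduces variation in risk across proxy sex groups.
    """
    risks = list(group_risks.values())
    return mean(risks) + beta * pvariance(risks)


def complementary_gate(features, gate_logits):
    """Route each feature into an identity part and a sex part.

    Assumption: the mask is sigmoid(gate_logits); identity gets m * f and
    the sex branch gets (1 - m) * f, so the two parts sum to the input.
    """
    mask = [1.0 / (1.0 + math.exp(-z)) for z in gate_logits]
    identity = [m * f for m, f in zip(mask, features)]
    sex = [(1.0 - m) * f for m, f in zip(mask, features)]
    return identity, sex, mask


if __name__ == "__main__":
    # Hypothetical per-group speaker-classification risks.
    print(vrex_objective({"female": 0.5, "male": 0.3}, beta=2.0))

    # Hypothetical intermediate features and gate logits.
    identity, sex, mask = complementary_gate([2.0, -1.0, 0.5], [3.0, 0.0, -3.0])
    print(mask)  # the routing mask that can be inspected per feature
```

In this sketch, a mask value near 1 means the feature is routed mostly to the identity branch and a value near 0 means it is routed mostly to the sex branch, which is the kind of inspection the abstract describes. Because the two masks are complementary, the branch outputs always sum back to the original feature.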