Fairness in human-robot interaction (HRI) critically depends on the reliability of the perceptual models that enable robots to interpret human behavior. While demographic biases have been widely studied in high-level facial analysis tasks, their presence in facial landmark detection remains unexplored. In this paper, we conduct a systematic audit of demographic bias in this task, analyzing age, gender, and race. To this end, we introduce a controlled statistical methodology that disentangles demographic effects from confounding visual factors. Our analysis shows that visual confounders, particularly head pose and face resolution, affect performance far more than demographic attributes do. Notably, after accounting for these confounders, performance disparities across gender and race vanish. However, we identify a statistically significant age-related bias, with higher localization errors for older individuals. This shows that fairness issues can emerge even in low-level vision components and can propagate through the HRI pipeline. We argue that auditing and correcting such biases is a necessary step toward trustworthy and equitable robot perception systems.