Individuals with impaired hearing experience difficulty in conversations, especially in noisy environments. This difficulty often manifests as a change in behavior that may be captured via facial expressions, such as expressions of discomfort or fatigue. In this work, we build on this idea and introduce the problem of detecting hearing loss from an individual's facial expressions during a conversation. Building machine learning models that can represent hearing-related changes in facial expression is challenging. In addition, models need to disentangle spurious age-related correlations from hearing-driven expressions. To this end, we propose a self-supervised pre-training strategy tailored to modeling expression variations, and we use adversarial representation learning to mitigate age bias. We evaluate our approach on a large-scale egocentric dataset of real-world conversational scenarios involving subjects with hearing loss, and show that our method achieves superior hearing loss detection performance over baselines.
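The abstract does not specify how the adversarial debiasing is implemented. One common instantiation of adversarial representation learning for removing a nuisance attribute is a gradient reversal layer: an auxiliary age classifier is trained on the shared features, but its gradients are flipped before reaching the encoder, so the encoder is pushed toward age-invariant representations while the hearing-loss head is trained normally. The sketch below illustrates this pattern in PyTorch; all names (`DebiasedHearingLossModel`, the encoder architecture, feature dimensions, number of age bins) are hypothetical and not taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GradientReversal(torch.autograd.Function):
    """Identity on the forward pass; scales gradients by -lambda on backward."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reversed gradient flows into the encoder; None for the lambda arg.
        return -ctx.lambd * grad_output, None


class DebiasedHearingLossModel(nn.Module):
    """Hypothetical two-head model: a hearing-loss classifier trained normally,
    and an adversarial age head trained through gradient reversal so the shared
    encoder is discouraged from encoding age information."""

    def __init__(self, in_dim=512, feat_dim=256, num_age_bins=5, grl_lambda=1.0):
        super().__init__()
        self.grl_lambda = grl_lambda
        # Placeholder encoder over pooled facial-expression features.
        self.encoder = nn.Sequential(nn.Linear(in_dim, feat_dim), nn.ReLU())
        self.hearing_head = nn.Linear(feat_dim, 2)          # hearing loss vs. not
        self.age_head = nn.Linear(feat_dim, num_age_bins)   # adversarial head

    def forward(self, x):
        z = self.encoder(x)
        hearing_logits = self.hearing_head(z)
        z_rev = GradientReversal.apply(z, self.grl_lambda)
        age_logits = self.age_head(z_rev)
        return hearing_logits, age_logits


# Minimal training step with dummy data: both heads minimize cross-entropy,
# but the reversed gradient makes the encoder *maximize* the age loss.
model = DebiasedHearingLossModel()
x = torch.randn(8, 512)                       # batch of pooled face features
y_hear = torch.randint(0, 2, (8,))            # hearing-loss labels
y_age = torch.randint(0, 5, (8,))             # age-bin labels
hearing_logits, age_logits = model(x)
loss = F.cross_entropy(hearing_logits, y_hear) + F.cross_entropy(age_logits, y_age)
loss.backward()
```

The lambda coefficient trades off task accuracy against age invariance; in this style of training it is often warmed up from 0 so the encoder first learns useful features before the adversarial pressure is applied.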