The widespread adoption of deep-learning models in data-driven applications has drawn attention to the potential risks posed by biased datasets and models. Neglected or hidden biases in datasets and models can lead to unexpected results. This study addresses the challenges of dataset bias and explores ``shortcut learning'', also known as the ``Clever Hans effect'', in binary classifiers. We propose a novel framework for analyzing black-box classifiers and for examining the impact of both training and test data on classifier scores. Our framework combines interventional and observational perspectives, employing a linear mixed-effects model for post-hoc analysis. By evaluating classifier performance beyond error rates, we aim to provide insights into biased datasets and a comprehensive understanding of their influence on classifier behavior. The effectiveness of our approach is demonstrated through experiments on audio anti-spoofing and speaker verification tasks using both statistical models and deep neural networks. The insights gained from this study have broader implications for tackling biases in other domains and for advancing the field of explainable artificial intelligence.