Training dataset biases are by far the most scrutinized factors when explaining algorithmic biases of neural networks. In contrast, hyperparameters related to the neural network architecture, such as the number of layers or the choice of activation function, have largely been ignored, even though different network parameterizations are known to induce different implicit biases over learned features. For example, convolutional kernel size has been shown to bias CNNs towards different frequencies. To study the effect of these hyperparameters, we designed a causal framework for linking an architectural hyperparameter to algorithmic bias. Our framework is experimental: several versions of a network are trained, each with an intervention on a specific hyperparameter, and the resulting causal effect of this choice on performance bias is measured. We focused on the causal relationship between sensitivity to high-frequency image details and face analysis classification performance across different subpopulations (race/gender). In this work, we show that modifying a CNN hyperparameter (convolutional kernel size), even in a single layer, not only changes a fundamental characteristic of the learned features (frequency content), but that this change can also vary significantly across data subgroups (race/gender populations), leading to biased generalization performance even when the dataset is balanced.
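As a rough illustration of the experimental logic described above, the sketch below compares a performance-bias metric across networks trained under two kernel-size interventions. It is a minimal sketch under stated assumptions, not the framework's actual implementation: the bias metric (the spread between the best and worst subgroup accuracy), the function names `accuracy_gap` and `average_effect_on_gap`, the subgroup labels, and all numbers are hypothetical placeholders rather than results from this work.

```python
from statistics import mean


def accuracy_gap(subgroup_accuracy):
    """Bias metric (an assumption): best minus worst subgroup accuracy."""
    values = list(subgroup_accuracy.values())
    return max(values) - min(values)


def average_effect_on_gap(runs_by_kernel, baseline, treatment):
    """Difference in mean accuracy gap between two kernel-size interventions.

    runs_by_kernel maps a kernel size to a list of dicts {subgroup: accuracy},
    one dict per independently trained network (e.g. a different random seed).
    """
    base = mean(accuracy_gap(run) for run in runs_by_kernel[baseline])
    treat = mean(accuracy_gap(run) for run in runs_by_kernel[treatment])
    return treat - base


# Toy placeholder numbers, NOT results from the paper.
toy_runs = {
    3: [{"group_a": 0.90, "group_b": 0.88}, {"group_a": 0.92, "group_b": 0.90}],
    7: [{"group_a": 0.90, "group_b": 0.84}, {"group_a": 0.92, "group_b": 0.88}],
}

# Positive values mean the treatment kernel size widened the subgroup gap.
effect = average_effect_on_gap(toy_runs, baseline=3, treatment=7)
print(f"Estimated effect on accuracy gap: {effect:.3f}")
```

Averaging over several independently trained networks per setting is what lets the difference be attributed to the intervention rather than to training noise. A fuller analysis would also report the uncertainty of that difference.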