Aligning model representations with humans has been found to improve robustness and generalization. However, such methods often focus on standard observational data. Synthetic data is proliferating and powers many advances in machine learning, yet it is not always clear whether synthetic labels are perceptually aligned with humans, which makes it likely that the resulting model representations are not human-aligned either. We focus on the synthetic data used in mixup, a powerful regularizer shown to improve model robustness, generalization, and calibration. We design a comprehensive series of elicitation interfaces, which we release as HILL MixE Suite, and recruit 159 participants to provide perceptual judgments, along with their uncertainties, over mixup examples. We find that human perceptions do not consistently align with the labels traditionally used for synthetic points, and we begin to demonstrate how these findings can be applied to potentially increase the reliability of downstream models, particularly when incorporating human uncertainty. We release all elicited judgments in a new data hub we call H-Mix.
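For readers unfamiliar with mixup, the "labels traditionally used for synthetic points" are presumably the linearly interpolated labels of the standard formulation, in which two examples and their one-hot labels are combined with the same mixing coefficient drawn from a Beta(alpha, alpha) distribution. The sketch below illustrates that standard formulation only; the function names `mixup` and `sample_lambda` and the toy inputs are hypothetical, and nothing here is taken from HILL MixE Suite or H-Mix.

```python
import random


def mixup(x_a, y_a, x_b, y_b, lam):
    """Interpolate two examples and their one-hot labels with weight lam.

    Hypothetical helper for illustration; inputs are plain lists of floats.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    # The same coefficient is applied to the inputs and to the labels.
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mix, y_mix


def sample_lambda(alpha=1.0, rng=random):
    """Draw the mixing coefficient from Beta(alpha, alpha)."""
    return rng.betavariate(alpha, alpha)


# Toy example: two flattened "images" with one-hot labels for two classes.
x_mix, y_mix = mixup([0.0, 1.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0], lam=0.75)
print(x_mix, y_mix)
```

With `lam=0.75`, the synthetic label assigns weight 0.75 to the first class and 0.25 to the second. Whether a human looking at the mixed input would report the same proportions is the kind of question the elicitation study addresses.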