Distribution shifts are a major source of failure for deployed machine learning models. However, evaluating a model's reliability under distribution shifts can be challenging, especially because it may be difficult to acquire counterfactual examples that exhibit a specified shift. In this work, we introduce dataset interfaces: a framework that allows users to scalably synthesize such counterfactual examples from a given dataset. Specifically, we represent each class from the input dataset as a custom token within the text space of a text-to-image diffusion model. By incorporating these tokens into natural language prompts, we can then generate instances of objects from that dataset under desired distribution shifts. We demonstrate how applying our framework to the ImageNet dataset enables us to study model behavior across a diverse array of shifts, including variations in background, lighting, and attributes of the objects themselves. Code is available at https://github.com/MadryLab/dataset-interfaces.
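To make the prompting step concrete, the following minimal sketch shows how a per-class custom token might be combined with natural language descriptions of a shift to form prompts for a text-to-image diffusion model. It illustrates the idea only and is not the API of the released code: the token format `<class-N>`, the prompt template, and the names `class_token` and `build_prompts` are all assumptions made for this example, and the sketch omits both learning the tokens and generating images.

```python
# Illustrative sketch only: the token format, the template, and the function
# names are hypothetical, not taken from the dataset-interfaces code.

def class_token(class_index: int) -> str:
    """Return a placeholder string standing in for a learned class token."""
    return f"<class-{class_index}>"


def build_prompts(class_index: int, shifts: dict[str, list[str]]) -> dict[str, list[str]]:
    """Build one prompt per shift description, grouped by shift type.

    `shifts` maps a shift type (e.g. "background") to phrases that describe
    the desired condition (e.g. "in the snow").
    """
    token = class_token(class_index)
    return {
        shift_type: [f"a photo of a {token} {phrase}" for phrase in phrases]
        for shift_type, phrases in shifts.items()
    }


if __name__ == "__main__":
    prompts = build_prompts(
        207,  # an arbitrary class index chosen for illustration
        {
            "background": ["in the snow", "on the beach"],
            "lighting": ["at night"],
        },
    )
    for shift_type, texts in prompts.items():
        for text in texts:
            print(f"{shift_type}: {text}")
```

In a full pipeline, each prompt would be passed to a diffusion model whose text encoder has been extended with the learned class tokens; that step depends on the specific model and is left out here.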