The SVHN Dataset Is Deceptive for Probabilistic Generative Models Due to a Distribution Mismatch

The Street View House Numbers (SVHN) dataset is a popular benchmark in deep learning. Originally designed for digit classification, it has been widely used as a benchmark for other tasks, including generative modeling. With this work, we warn the community about a problem with SVHN as a benchmark for generative modeling: we find that the official training and test sets are not drawn from the same distribution. We show empirically that this distribution mismatch has little impact on classification (which may explain why the issue has not been detected before), but that it severely affects the evaluation of probabilistic generative models such as Variational Autoencoders and diffusion models. As a workaround, we propose mixing and re-splitting the official training and test sets when SVHN is used for tasks other than classification. We publish a new split and the indices we used to create it at https://jzenn.github.io/svhn-remix/ .
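The mix-and-re-split workaround can be illustrated with a minimal sketch. This is not the authors' procedure: the function name `remix_split`, the seed, and the choice of a uniform shuffle are assumptions made for illustration, and the set sizes (73,257 training and 26,032 test images) are the commonly cited counts for the official cropped-digit split. For comparable results, the indices published at the URL above should be preferred over a freshly generated split.

```python
import random


def remix_split(n_train, n_test, seed=0):
    """Pool the official train and test indices, shuffle, and re-split.

    Each item is a (source, index) pair, where source is "train" or "test"
    and index is the position inside that official set. The new sets keep
    the original sizes, but both are drawn from the same pooled data.
    """
    pool = [("train", i) for i in range(n_train)]
    pool += [("test", i) for i in range(n_test)]

    # A seeded generator makes the split reproducible.
    rng = random.Random(seed)
    rng.shuffle(pool)

    return pool[:n_train], pool[n_train:]


# Assumed sizes of the official SVHN training and test sets.
new_train, new_test = remix_split(73257, 26032, seed=0)
```

Storing `(source, index)` pairs rather than image arrays keeps the split small and shareable, which mirrors the idea of publishing indices alongside the new split.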

