Deep learning algorithms demonstrate a surprising ability to learn high-dimensional tasks from limited examples. This is commonly attributed to the depth of neural networks, which enables them to build a hierarchy of abstract, low-dimensional data representations. However, how many training examples are required to learn such representations remains unknown. To study this question quantitatively, we introduce the Random Hierarchy Model: a family of synthetic tasks inspired by the hierarchical structure of language and images. The model is a classification task where each class corresponds to a group of high-level features, chosen among several equivalent groups associated with the same class. In turn, each feature corresponds to a group of sub-features chosen among several equivalent ones, and so on, following a hierarchy of composition rules. We find that deep networks learn the task by developing internal representations invariant to exchanging equivalent groups. Moreover, the number of training examples required corresponds to the point where correlations between low-level features and classes become detectable. Overall, our results indicate how deep networks overcome the curse of dimensionality by building invariant representations, and provide an estimate of the number of training examples required to learn a hierarchical task.
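To make the generative process described above concrete, the following is a minimal sketch of how data from such a hierarchy could be sampled. It is an illustration under stated assumptions, not the construction used in the paper: the parameter names (`num_classes`, `vocab_size`, `m` equivalent groups per symbol, group size `s`, `depth`) and all function names are hypothetical. The sketch further assumes that every level shares one vocabulary size and that no group is assigned to two different higher-level symbols.

```python
import itertools
import random


def make_rules(num_symbols, vocab_size, m, s, rng):
    """Give each higher-level symbol m equivalent groups (s-tuples) of
    lower-level symbols. Assumption: no group is shared between symbols."""
    all_tuples = list(itertools.product(range(vocab_size), repeat=s))
    if num_symbols * m > len(all_tuples):
        raise ValueError("not enough distinct groups for the requested rules")
    # Sampling without replacement keeps the groups of different symbols disjoint.
    chosen = rng.sample(all_tuples, num_symbols * m)
    return [chosen[i * m:(i + 1) * m] for i in range(num_symbols)]


def build_hierarchy(num_classes, vocab_size, m, s, depth, seed=0):
    """Draw one random set of composition rules per level, top level first."""
    rng = random.Random(seed)
    rules = [make_rules(num_classes, vocab_size, m, s, rng)]
    for _ in range(depth - 1):
        rules.append(make_rules(vocab_size, vocab_size, m, s, rng))
    return rules


def sample_input(rules, label, rng):
    """Expand a class label down the hierarchy into s**depth low-level features."""
    symbols = [label]
    for level_rules in rules:
        expanded = []
        for sym in symbols:
            # Pick one of the m equivalent groups uniformly at random.
            expanded.extend(rng.choice(level_rules[sym]))
        symbols = expanded
    return symbols


def sample_dataset(rules, num_classes, n, seed=1):
    """Return n (input, label) pairs with labels drawn uniformly."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randrange(num_classes)
        data.append((sample_input(rules, label, rng), label))
    return data


if __name__ == "__main__":
    rules = build_hierarchy(num_classes=2, vocab_size=4, m=2, s=2, depth=3)
    for x, y in sample_dataset(rules, num_classes=2, n=3):
        print(y, x)
```

In this sketch, two inputs carry the same label whenever they differ only by the choice among equivalent groups at some level, which is the kind of exchange the learned representations are reported to be invariant to.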