Deep neural networks (DNNs) and, in particular, convolutional neural networks (CNNs) have brought significant advances in a wide range of modern computer application problems. However, the increasing availability of large datasets and the growing computational power of modern computers have led to a steady growth in the complexity and size of DNN and CNN models and, thus, to longer training times. Hence, various methods have been developed to accelerate and parallelize the training of complex network architectures. In this work, a novel CNN-DNN architecture is proposed that naturally supports a model-parallel training strategy and that is loosely inspired by two-level domain decomposition methods (DDMs). First, local CNN models, that is, subnetworks, are defined that operate on overlapping or nonoverlapping parts of the input data, for example, sub-images. The subnetworks can be trained completely in parallel. Each subnetwork outputs a local decision for the given machine learning problem that is based exclusively on its local input data. Subsequently, an additional DNN model is trained that evaluates the local decisions of the subnetworks and generates a final, global decision. In analogy to DDMs, the DNN can be interpreted as a coarse problem and, hence, the new approach can be interpreted as a two-level domain decomposition. In this paper, only image classification problems using CNNs are considered. Experimental results are provided for different 2D image classification problems, a face recognition problem, and a classification problem for 3D computed tomography (CT) scans. The results show that the proposed approach can significantly reduce the required training time compared to the global model and, additionally, can help to improve the accuracy of the underlying classification problem.
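The data flow described above (decompose the input, evaluate the local models independently, pass the collected local decisions to a global model) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the names `split_image`, `coarse_input`, and `dummy_local_model` are hypothetical, the grid decomposition with a fixed pixel overlap is an assumption, and the placeholder local model merely stands in for a trained CNN that returns class probabilities.

```python
from typing import Callable, List

Image = List[List[float]]                    # 2D image as rows of pixel values
LocalModel = Callable[[Image], List[float]]  # sub-image -> local class scores


def split_image(image: Image, n_rows: int, n_cols: int, overlap: int = 0) -> List[Image]:
    """Decompose an image into an n_rows x n_cols grid of sub-images.

    Each sub-image is extended by `overlap` pixels on every side and clipped
    at the image boundary; overlap=0 gives a nonoverlapping decomposition.
    """
    height, width = len(image), len(image[0])
    if height % n_rows or width % n_cols:
        raise ValueError("image size must be divisible by the grid size")
    h, w = height // n_rows, width // n_cols
    patches = []
    for i in range(n_rows):
        for j in range(n_cols):
            r0, r1 = max(i * h - overlap, 0), min((i + 1) * h + overlap, height)
            c0, c1 = max(j * w - overlap, 0), min((j + 1) * w + overlap, width)
            patches.append([row[c0:c1] for row in image[r0:r1]])
    return patches


def coarse_input(patches: List[Image], local_models: List[LocalModel]) -> List[float]:
    """Concatenate the local decisions into one input vector for the global DNN.

    Each local model only sees its own sub-image, so the calls are
    independent and could run (and be trained) in parallel.
    """
    features: List[float] = []
    for patch, model in zip(patches, local_models):
        features.extend(model(patch))
    return features


def dummy_local_model(patch: Image) -> List[float]:
    """Placeholder for a trained local CNN (assumption): two pseudo-scores
    derived from the mean pixel intensity of the sub-image."""
    pixels = [p for row in patch for p in row]
    mean = sum(pixels) / len(pixels)
    return [mean, 1.0 - mean]


if __name__ == "__main__":
    image = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
    patches = split_image(image, n_rows=2, n_cols=2, overlap=1)
    print(coarse_input(patches, [dummy_local_model] * len(patches)))
```

In this sketch the vector returned by `coarse_input` has one block of scores per subnetwork; a global DNN would be trained on such vectors to produce the final decision, which corresponds to the coarse level in the domain decomposition analogy.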