DeepMAD: Mathematical Architecture Design for Deep Convolutional Neural Network

Rapid advances in Vision Transformers (ViT) have refreshed the state of the art on various vision tasks, overshadowing conventional CNN-based models. This has prompted recent work in the CNN community to strike back, showing that pure CNN models can perform as well as ViT models when carefully tuned. While encouraging, designing such high-performance CNN models is challenging and requires non-trivial prior knowledge of network design. To this end, a novel framework termed Mathematical Architecture Design for Deep CNN (DeepMAD) is proposed to design high-performance CNN models in a principled way. In DeepMAD, a CNN is modeled as an information processing system whose expressiveness and effectiveness can be analytically formulated in terms of its structural parameters. A constrained mathematical programming (MP) problem is then posed to optimize these structural parameters. The MP problem can be easily solved by off-the-shelf MP solvers on CPUs with a small memory footprint. In addition, DeepMAD is a purely mathematical framework: no GPU or training data is required during network design. The superiority of DeepMAD is validated on multiple large-scale computer vision benchmark datasets. Notably, on ImageNet-1k, using only conventional convolutional layers, DeepMAD achieves 0.7% and 1.5% higher top-1 accuracy than ConvNeXt and Swin at the Tiny level, and 0.8% and 0.9% higher at the Small level.
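To make the idea of training-free, CPU-only design concrete, the following is a minimal toy sketch, not the DeepMAD formulation itself. It assumes a plain stack of identical 3x3 convolutional layers and searches over depth and width under a parameter budget. The scoring function `score`, the depth-to-width cap `max_ratio`, and all function names are hypothetical stand-ins chosen for illustration; the abstract does not specify the actual objective or constraints.

```python
import math
from itertools import product


def param_count(depth, width, kernel=3):
    # Parameters of `depth` stacked conv layers, each width -> width channels.
    return depth * kernel * kernel * width * width


def score(depth, width, kernel=3):
    # Hypothetical surrogate for "expressiveness": grows with depth and
    # with the log of the per-layer fan-in. An assumption, not the paper's formula.
    return depth * math.log(width * kernel * kernel)


def design(budget, max_ratio, depths=range(1, 65), widths=range(8, 513, 8)):
    # Exhaustive search over a small grid: no GPU, no training data.
    best = None
    for depth, width in product(depths, widths):
        if param_count(depth, width) > budget:
            continue  # violates the parameter budget
        if depth / width > max_ratio:
            continue  # hypothetical cap on depth relative to width
        candidate = (score(depth, width), depth, width)
        if best is None or candidate > best:
            best = candidate
    return best  # (score, depth, width), or None if nothing is feasible


if __name__ == "__main__":
    print(design(budget=1_000_000, max_ratio=0.1))
```

A real design problem of this kind would hand a continuous relaxation to an off-the-shelf MP solver rather than enumerate a grid; the grid search is used here only to keep the sketch dependency-free.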

