Scaling up neural networks has been a key recipe for the success of large language and vision models. However, in practice, scaled-up models can be disproportionately costly in computation while providing only marginal improvements in performance; for example, EfficientViT-L3-384 achieves less than 2% improvement in ImageNet-1K accuracy over the base L1-224 model, while requiring $14\times$ more multiply-accumulate operations (MACs). In this paper, we investigate the scaling properties of popular families of neural networks for image classification, and find that scaled-up models mostly help with "difficult" samples. Decomposing the samples by difficulty, we develop a simple, model-agnostic two-pass Little-Big algorithm that first uses a lightweight "little" model to make predictions on all samples, and passes only the difficult ones to the "big" model to solve. A good little companion achieves drastic MACs reductions across a wide variety of model families and scales. Without loss of accuracy or modification of existing models, our Little-Big models achieve MACs reductions of 76% for EfficientViT-L3-384, 81% for EfficientNet-B7-600, and 71% for DeiT3-L-384 on ImageNet-1K. Little-Big also speeds up the InternImage-G-512 model by 62% while achieving 90% ImageNet-1K top-1 accuracy, serving both as a strong baseline and as a simple, practical method for large model compression.
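The two-pass rule described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `little` and `big` are assumed to be callables returning class logits, and the confidence threshold `tau` on the little model's softmax probability is a hypothetical criterion for "difficult" samples (the abstract does not specify how difficulty is measured).

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def little_big_predict(little, big, x, tau=0.9):
    """Two-pass Little-Big inference sketch: trust the little model on
    confident ("easy") samples; re-run only the rest through the big model."""
    probs = softmax(little(x))            # first pass: little model on ALL samples
    conf = probs.max(axis=-1)             # little model's confidence per sample
    preds = probs.argmax(axis=-1)
    hard = conf < tau                     # assumed difficulty criterion
    if hard.any():                        # second pass: big model on hard samples only
        preds[hard] = softmax(big(x[hard])).argmax(axis=-1)
    return preds
```

Since the big model runs only on the low-confidence subset, the average cost per sample approaches the little model's cost when most samples are easy, which is the source of the reported MACs reductions.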