OneCAD: One Classifier for All image Datasets using multimodal learning

Vision Transformers (ViTs) and convolutional neural networks (CNNs) are widely used deep neural networks (DNNs) for classification tasks. These model architectures depend on the number of classes in the dataset they are trained on: any change in the number of classes leads to a partial or full change in the model's architecture. This work addresses the question: is it possible to create a number-of-class-agnostic model architecture? Such an architecture would be independent of the dataset it is trained on. This work highlights the issues with current architectures (ViTs and CNNs) and proposes a training and inference framework, OneCAD (One Classifier for All image Datasets), to achieve a close-to number-of-class-agnostic transformer model. To the best of our knowledge, this is the first work to use Mask-Image-Modeling (MIM) with multimodal learning for the classification task to create a DNN model architecture agnostic to the number of classes. Preliminary results are shown on natural and medical image datasets: MNIST, CIFAR10, CIFAR100 and COVIDx. Code will soon be publicly available on GitHub.
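
The dependence on the number of classes described in the abstract can be made concrete with a small sketch. It is not the OneCAD method and not taken from the paper; it only shows why a conventional linear classification head ties the architecture to the dataset. The feature size of 768 and the helper names `linear_head_shape` and `linear_head_param_count` are assumptions chosen for illustration.

```python
# Illustrative sketch only: shows how a conventional linear classification
# head depends on the number of classes. This is NOT the OneCAD framework.

# Assumed backbone embedding size (hypothetical; chosen for illustration).
FEATURE_DIM = 768

# Class counts of three datasets named in the abstract.
CLASS_COUNTS = {"MNIST": 10, "CIFAR10": 10, "CIFAR100": 100}


def linear_head_shape(feature_dim, num_classes):
    """Weight-matrix shape of a linear head mapping features to class logits."""
    return (num_classes, feature_dim)


def linear_head_param_count(feature_dim, num_classes):
    """Number of trainable parameters in that head (weights plus biases)."""
    return num_classes * feature_dim + num_classes


if __name__ == "__main__":
    for name, num_classes in CLASS_COUNTS.items():
        shape = linear_head_shape(FEATURE_DIM, num_classes)
        params = linear_head_param_count(FEATURE_DIM, num_classes)
        print(f"{name}: head weight shape {shape}, {params} parameters")
```

Under these assumptions, the CIFAR10 and CIFAR100 heads have different weight shapes, so a head trained for one dataset cannot be reused for the other without replacing that layer. That is the kind of architectural change the abstract refers to.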

