Capsule Networks face a critical problem in computer vision: although they learn very well on training data, image backgrounds can degrade their performance. In this work, we propose to improve the Capsule Network architecture by replacing the Standard Convolution with a Depthwise Separable Convolution. This new design significantly reduces the model's total parameter count while increasing stability and offering competitive accuracy. In addition, the proposed model on $64\times64$ pixel images outperforms standard models on $32\times32$ and $64\times64$ pixel images. Moreover, we empirically compare these models against Deep Learning architectures built on state-of-the-art Transfer Learning networks such as Inception V3 and MobileNet V1. The results show that Capsule Networks can perform comparably to Deep Learning models. To the best of our knowledge, this is the first work to integrate Depthwise Separable Convolution into Capsule Networks.
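To make the parameter-reduction claim concrete, the following is a minimal sketch that counts the weights of a single standard convolution layer and of its depthwise separable counterpart (a per-channel depthwise filter followed by a $1\times1$ pointwise convolution). The function names and the layer sizes used here (a $9\times9$ kernel with 256 input and 256 output channels) are illustrative assumptions and are not taken from the experiments reported in this work.

```python
def standard_conv_params(k, c_in, c_out, bias=True):
    """Weights in a standard k x k convolution: every output channel
    has its own k x k filter over every input channel."""
    return k * k * c_in * c_out + (c_out if bias else 0)


def depthwise_separable_params(k, c_in, c_out, bias=True):
    """Weights in a depthwise separable convolution:
    one k x k filter per input channel (depthwise), followed by
    a 1 x 1 convolution that mixes channels (pointwise)."""
    depthwise = k * k * c_in + (c_in if bias else 0)
    pointwise = c_in * c_out + (c_out if bias else 0)
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer sizes, chosen only for illustration.
    k, c_in, c_out = 9, 256, 256
    std = standard_conv_params(k, c_in, c_out)
    sep = depthwise_separable_params(k, c_in, c_out)
    print(f"standard conv:       {std:,} parameters")
    print(f"depthwise separable: {sep:,} parameters")
    print(f"reduction factor:    {std / sep:.1f}x")
```

Ignoring biases, the ratio of the two counts is $1 / (1/C_{out} + 1/k^2)$, so the saving for a single layer grows with both the kernel size and the number of output channels. How much this reduces a full Capsule Network's total depends on how many of its parameters sit in the convolutional layers rather than in the capsule layers.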