Capsule networks are a class of neural networks that identify image parts and hierarchically compose them into the instantiation parameters of a whole. The network is designed to perform an inverse computer-graphics task, with its parameters acting as the mapping weights that transform parts into wholes. However, training capsule networks on complex data with high intra-class or intra-part variation is challenging. This paper presents a multi-prototype architecture that guides capsule networks to represent the variation in image parts. To this end, instead of assigning a single capsule to each class and part, the proposed method employs several capsules (co-group capsules) that capture multiple prototypes of an object. In the final layer, co-group capsules compete, and their soft output serves as the target of a competitive cross-entropy loss. In the middle layers, the most active capsules are mapped to the next layer with weights shared among the co-groups. This implicit weight sharing reduces the parameter count and thus makes deeper capsule networks feasible. Experimental results on the MNIST, SVHN, C-Cube, CEDAR, MCYT, and UTSig datasets show that the proposed model outperforms existing methods in image classification accuracy.
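The competition among co-group capsules in the final layer can be illustrated with a minimal sketch: a softmax over all capsule activations makes the co-group capsules compete, and each class's probability is the sum over its prototypes. This is an assumed, simplified formulation for illustration only (the function name, the flat capsule layout, and the use of capsule lengths as activations are choices made here, not the paper's exact implementation).

```python
import numpy as np

def competitive_cross_entropy(capsule_lengths, label, n_prototypes):
    """Illustrative competitive loss over co-group capsules.

    capsule_lengths: array of shape (n_classes * n_prototypes,),
        one activation (capsule length) per (class, prototype) pair.
    label: index of the true class.
    n_prototypes: number of co-group capsules per class.
    """
    # Softmax over ALL capsules: co-group capsules compete with one
    # another as well as with the capsules of other classes.
    shifted = capsule_lengths - capsule_lengths.max()  # numerical stability
    exp = np.exp(shifted)
    probs = exp / exp.sum()
    # A class's soft output is the sum over its co-group capsules,
    # so only the winning prototype needs to be strongly active.
    class_probs = probs.reshape(-1, n_prototypes).sum(axis=1)
    return -np.log(class_probs[label])
```

With three classes and two prototypes each, a strongly active capsule for the true class yields a small loss, while the same activations scored against a wrong label yield a large one:

```python
lengths = np.array([5.0, 0.1, 0.1, 0.1, 0.1, 0.1])
competitive_cross_entropy(lengths, label=0, n_prototypes=2)  # small loss
competitive_cross_entropy(lengths, label=1, n_prototypes=2)  # large loss
```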