Self-supervised learning (SSL) with mixed images has been studied as a way to learn diverse image representations. Existing mixed-image methods learn a representation by maximizing the similarity between the representation of the mixed image and a representation synthesized from those of the original images. However, few methods consider this synthesis from the perspective of mathematical logic. In this study, we focus on how representations are synthesized. We propose a new SSL method with mixed images and a new representation format based on many-valued logic. This format expresses the feature-possession degree, that is, how strongly a representation possesses each image feature. Combining this format with representation synthesis by logic operations allows the synthesized representation to preserve the salient characteristics of the original representations. Our method performs competitively with previous representation synthesis methods on image classification tasks. To verify that the intended learning is achieved, we also examine the relationship between the feature-possession degree and the number of image classes in a multilabel image classification dataset. In addition, we discuss image retrieval as an application of the proposed representation format based on many-valued logic.
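To make the idea of a feature-possession degree concrete, the following is a minimal sketch and not the method proposed in this study. It assumes that each representation is a vector of truth values in [0, 1], and it uses two standard many-valued disjunctions (Gödel and Łukasiewicz) as stand-ins, since the abstract does not specify the logic operation. All function names and example values are hypothetical.

```python
# Illustrative sketch only. Assumptions (not from the source):
#  - a representation is a list of truth values in [0, 1], one per feature;
#  - synthesis uses a standard many-valued disjunction (Goedel or Lukasiewicz).

def godel_or(a, b):
    """Element-wise Goedel disjunction: max(a_i, b_i)."""
    return [max(x, y) for x, y in zip(a, b)]


def lukasiewicz_or(a, b):
    """Element-wise Lukasiewicz disjunction: min(1, a_i + b_i)."""
    return [min(1.0, x + y) for x, y in zip(a, b)]


def average(a, b):
    """Element-wise mean, shown only as a non-logical baseline for comparison."""
    return [(x + y) / 2 for x, y in zip(a, b)]


# Hypothetical feature-possession degrees for two original images.
rep_a = [1.0, 0.25, 0.0, 0.5]
rep_b = [0.0, 0.25, 0.75, 0.5]

synth_godel = godel_or(rep_a, rep_b)
synth_lukasiewicz = lukasiewicz_or(rep_a, rep_b)
synth_average = average(rep_a, rep_b)

print("Goedel OR:      ", synth_godel)
print("Lukasiewicz OR: ", synth_lukasiewicz)
print("Average:        ", synth_average)
```

In this toy example, the first feature is fully possessed by one image (degree 1.0) and absent from the other. Both disjunctions keep that degree at 1.0 in the synthesized vector, whereas averaging halves it to 0.5. This is one way to read the claim that a logic-based synthesis can preserve the salient characteristics of the original representations.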