Self-supervised representation learning in computer vision relies heavily on hand-crafted image transformations to learn meaningful and invariant features. However, the impact of transformation design has rarely been explored extensively in the literature. In particular, the dependence of downstream performance on transformation design has been established but not studied in depth. In this work, we explore this relationship, examine its impact on a domain other than natural images, and show that designing the transformations can be viewed as a form of supervision. First, we demonstrate not only that transformations affect downstream performance and the relevance of clustering, but also that each category in a supervised dataset can be affected differently. Next, we explore the impact of transformation design on microscopy images, a domain where the differences between classes are subtler and fuzzier than in natural images. In this case, we observe a greater impact on downstream task performance. Finally, we demonstrate that transformation design can be leveraged as a form of supervision: careful selection of transformations by a domain expert can lead to a drastic increase in performance on a given downstream task.