Self-supervised foundation models for digital pathology encode small patches from H\&E whole slide images into latent representations used for downstream tasks. However, the invariance of these representations to patch rotation remains unexplored. This study investigates the rotational invariance of latent representations across twelve foundation models by quantifying the alignment between non-rotated and rotated patches using mutual $k$-nearest neighbours and cosine distance. Models that incorporated rotation augmentation during self-supervised training exhibited significantly greater invariance to rotations. We hypothesise that the absence of rotational inductive bias in the transformer architecture necessitates rotation augmentation during training to achieve learned invariance. Code: https://github.com/MatousE/rot-invariance-analysis.
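The two alignment measures named above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the Euclidean neighbourhood construction, and the choice of `k` are assumptions. `cosine_distance` compares each patch embedding to its rotated counterpart directly, while `mutual_knn_alignment` compares the neighbourhood structure of the two embedding sets, reporting the average fraction of shared $k$-nearest neighbours.

```python
import numpy as np

def cosine_distance(z, z_rot):
    """Mean cosine distance between paired embeddings (row i of z
    vs. row i of z_rot); 0 means perfectly aligned directions."""
    a = z / np.linalg.norm(z, axis=1, keepdims=True)
    b = z_rot / np.linalg.norm(z_rot, axis=1, keepdims=True)
    return float(np.mean(1.0 - np.sum(a * b, axis=1)))

def mutual_knn_alignment(z, z_rot, k=10):
    """Average overlap between each sample's k-NN sets computed
    separately in the non-rotated and rotated embedding spaces;
    1.0 means the neighbourhood structure is fully preserved."""
    def knn_indices(x):
        # Pairwise Euclidean distances, excluding self-matches.
        d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
        np.fill_diagonal(d, np.inf)
        return np.argsort(d, axis=1)[:, :k]
    nn_a, nn_b = knn_indices(z), knn_indices(z_rot)
    overlap = [len(set(nn_a[i]) & set(nn_b[i])) / k
               for i in range(len(z))]
    return float(np.mean(overlap))
```

A perfectly rotation-invariant encoder would map a patch and its rotations to identical embeddings, giving a cosine distance near 0 and a mutual k-NN alignment near 1; the gap from those ideals quantifies how far each foundation model departs from invariance.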