Test-time adaptation (TTA) refers to adapting a classifier to test data whose probability distribution differs slightly from that of the model's training data. To the best of our knowledge, most existing TTA approaches modify the classifier's weights in ways that depend heavily on its architecture, and it is unclear how they extend to generic architectures. In this article, we propose an architecture-agnostic approach to TTA: we add an adapter network that pre-processes input images to make them suitable for the classifier. The adapter is trained with our proposed quantile loss. Unlike existing approaches, we correct for the distribution shift by matching high-dimensional geometric quantiles. We prove theoretically that, under suitable conditions, minimizing the quantile loss learns the optimal adapter. We validate our approach on CIFAR10-C, CIFAR100-C and TinyImageNet-C, using both classic convolutional and transformer networks trained on CIFAR10, CIFAR100 and TinyImageNet.
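To illustrate the notion of a high-dimensional geometric quantile that the abstract refers to, here is a minimal NumPy sketch in the style of Chaudhuri's geometric quantiles: the u-th quantile of a sample minimizes the empirical loss mean(||x - q|| + <u, x - q>) over q, for a direction vector u in the open unit ball. The function name, the plain gradient-descent optimizer, and all hyperparameters below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def geometric_quantile(X, u, lr=0.1, steps=500):
    """Estimate the u-th geometric quantile of a sample X of shape (n, d).

    Minimizes mean_i ||x_i - q|| + <u, x_i - q> by gradient descent
    (illustrative sketch; requires ||u|| < 1 for the loss to be bounded below).
    """
    q = X.mean(axis=0)  # initialize at the sample mean
    for _ in range(steps):
        diff = X - q
        # small epsilon avoids division by zero when q coincides with a point
        norms = np.linalg.norm(diff, axis=1, keepdims=True) + 1e-12
        # gradient of the per-sample loss w.r.t. q: -(x - q)/||x - q|| - u
        grad = (-diff / norms - u).mean(axis=0)
        q = q - lr * grad
    return q
```

For u = 0 this recovers the geometric median (the center of a symmetric point cloud); a nonzero u shifts the quantile in the direction of u, which is what makes a family of such quantiles a usable summary of a high-dimensional distribution for matching source and shifted test data.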