Machine learning models often fail on out-of-distribution (OOD) samples. Visual prompts have emerged as a lightweight, input-space adaptation method for large-scale vision models. Existing visual prompts optimize a high-dimensional additive vector and require labeled data at training time. However, we find that this paradigm fails for test-time adaptation, when labeled data is unavailable, because the high-dimensional visual prompt overfits to the self-supervised objective. We present convolutional visual prompts for test-time adaptation without labels. Our convolutional prompt is structured and requires far fewer trainable parameters (less than 1% of the parameters of standard visual prompts). Extensive experiments on a wide variety of OOD recognition tasks show that our approach is effective, improving robustness by up to 5.87% across a number of large-scale model architectures.
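To make the parameter contrast concrete, the following is a minimal sketch, not the authors' implementation. It assumes a 3 x 224 x 224 input, one 3 x 3 kernel per channel, and a residual form x + strength * conv(x, k); none of these choices is stated in the abstract, and all function names are hypothetical.

```python
# Illustrative sketch only. The image size, kernel size, per-channel kernels,
# and the residual form are assumptions, not details taken from the abstract.

def additive_prompt_params(channels, height, width):
    """An additive prompt has one trainable value per input pixel."""
    return channels * height * width


def conv_prompt_params(channels, kernel_size):
    """A per-channel convolutional prompt has one small kernel per channel."""
    return channels * kernel_size * kernel_size


def apply_additive_prompt(image, prompt):
    """Return image + prompt, where both are nested lists shaped [C][H][W]."""
    return [
        [[p + d for p, d in zip(img_row, prm_row)]
         for img_row, prm_row in zip(img_ch, prm_ch)]
        for img_ch, prm_ch in zip(image, prompt)
    ]


def apply_conv_prompt(image, kernels, strength=1.0):
    """Return x + strength * conv(x, k), per channel, with zero padding."""
    output = []
    for channel, kernel in zip(image, kernels):
        height, width = len(channel), len(channel[0])
        size = len(kernel)
        pad = size // 2
        new_channel = []
        for i in range(height):
            row = []
            for j in range(width):
                acc = 0.0
                for di in range(size):
                    for dj in range(size):
                        y, x = i + di - pad, j + dj - pad
                        # Pixels outside the image count as zero.
                        if 0 <= y < height and 0 <= x < width:
                            acc += kernel[di][dj] * channel[y][x]
                row.append(channel[i][j] + strength * acc)
            new_channel.append(row)
        output.append(new_channel)
    return output


if __name__ == "__main__":
    n_additive = additive_prompt_params(3, 224, 224)
    n_conv = conv_prompt_params(3, 3)
    print(n_additive, n_conv, n_conv / n_additive)
```

Under these assumed sizes the convolutional prompt has 27 trainable values against 150,528 for the additive prompt, which is consistent with the "less than 1%" figure above. The same small kernel is applied at every spatial location, which is the sense in which the prompt is structured.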