Domain generalization (DG) is a principal task for evaluating the robustness of computer vision models. Many previous studies have used normalization for DG. In normalization, the statistics and the normalized features are regarded as style and content, respectively. However, because the boundary between content and style is unclear, removing style also alters content (the content variation problem). This study addresses the problem from a frequency-domain perspective, in which amplitude and phase are considered style and content, respectively. First, we quantitatively verify the phase variation introduced by normalization through a mathematical derivation based on the Fourier transform. Building on this, we propose a novel normalization method, PCNorm, which removes only style while preserving content through spectral decomposition. Furthermore, we propose advanced PCNorm variants, CCNorm and SCNorm, which adjust the degrees of variation in content and style, respectively, so that they can learn domain-agnostic representations for DG. Using these normalization methods, we propose ResNet-variant models, DAC-P and DAC-SC, which are robust to the domain gap. The proposed models outperform other recent DG methods. DAC-SC achieves a state-of-the-art average performance of 65.6% on five datasets: PACS, VLCS, Office-Home, DomainNet, and TerraIncognita.
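The abstract describes the idea of removing style (amplitude) while keeping content (phase), but it does not give the formula. The NumPy sketch below shows one possible reading: take the amplitude spectrum of the normalized feature map and recombine it with the phase spectrum of the unnormalized input. The function names `batch_normalize` and `phase_preserving_normalize`, the choice of per-channel batch statistics, and the `eps` value are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def batch_normalize(x, eps=1e-5):
    """Standardize an (N, C, H, W) batch per channel using batch statistics."""
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def phase_preserving_normalize(x, eps=1e-5):
    """Hypothetical sketch: amplitude of the normalized map, phase of the input."""
    # Spectral decomposition of the input and of its normalized version.
    spectrum_in = np.fft.fft2(x, axes=(-2, -1))
    spectrum_norm = np.fft.fft2(batch_normalize(x, eps), axes=(-2, -1))

    amplitude = np.abs(spectrum_norm)  # treated as "style" after normalization
    phase = np.angle(spectrum_in)      # treated as "content" of the original

    # Recompose and return to the spatial domain. The imaginary part is only
    # rounding error, because the input is real-valued.
    recombined = amplitude * np.exp(1j * phase)
    return np.fft.ifft2(recombined, axes=(-2, -1)).real


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.normal(loc=3.0, scale=2.0, size=(4, 3, 8, 8))
    out = phase_preserving_normalize(features)
    print(out.shape)
```

With scalar per-channel statistics, plain normalization leaves every non-zero frequency's phase unchanged (it only rescales by a positive constant), so in this sketch the two outputs differ only at the zero-frequency term, where subtracting the batch mean can flip the sign of a sample's mean.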