Neural networks do not generalize well to unseen data under domain shift, a longstanding problem in machine learning and AI. To address this problem, we propose MixStyle, a simple, plug-and-play, parameter-free module that improves domain generalization performance without collecting more data or increasing model capacity. The design of MixStyle is simple: during training, it mixes the feature statistics of two random instances in a single forward pass. The idea is grounded in the finding from recent style transfer research that feature statistics capture image style information, which essentially defines visual domains. Mixing feature statistics can therefore be seen as an efficient way to synthesize new domains in the feature space, and thus as a form of data augmentation. MixStyle can be implemented in a few lines of code, requires no modification to training objectives, and fits a variety of learning paradigms, including supervised domain generalization, semi-supervised domain generalization, and unsupervised domain adaptation. Our experiments show that MixStyle can significantly boost out-of-distribution generalization performance across a wide range of tasks, including image recognition, instance retrieval, and reinforcement learning.
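To make the mixing step concrete, the following is a minimal NumPy sketch of the idea rather than the authors' implementation. The abstract does not specify how the statistics are computed or combined, so several choices here are assumptions: the function name `mix_style` is hypothetical, the statistics are taken to be the per-instance, per-channel mean and standard deviation over spatial positions (as is common in style transfer), the partner instance is chosen by shuffling the batch, and the mixing weight is drawn from a Beta distribution with a hypothetical default `alpha=0.1`.

```python
import numpy as np


def mix_style(x, rng, alpha=0.1, eps=1e-6):
    """Mix per-instance feature statistics across a batch (illustrative sketch).

    x:   feature maps with shape (batch, channels, height, width).
    rng: a numpy.random.Generator used for the shuffle and the mixing weights.
    """
    # Per-instance, per-channel mean and standard deviation over spatial dims.
    mu = x.mean(axis=(2, 3), keepdims=True)
    sigma = np.sqrt(x.var(axis=(2, 3), keepdims=True) + eps)

    # Remove each instance's own statistics ("style"), keeping its content.
    x_norm = (x - mu) / sigma

    # Pair every instance with a randomly chosen instance from the same batch.
    perm = rng.permutation(x.shape[0])

    # One mixing weight per instance (the Beta prior is an assumption).
    lam = rng.beta(alpha, alpha, size=(x.shape[0], 1, 1, 1))

    # Interpolate the statistics of each instance and its partner.
    mu_mix = lam * mu + (1.0 - lam) * mu[perm]
    sigma_mix = lam * sigma + (1.0 - lam) * sigma[perm]

    # Re-apply the mixed statistics to the normalized features.
    return x_norm * sigma_mix + mu_mix


# Usage: a batch of 4 feature maps with 3 channels and 8x8 spatial size.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 3, 8, 8))
mixed = mix_style(features, rng)
```

The function has no learnable parameters and leaves the tensor shape unchanged, which is consistent with the plug-and-play description above. Since the abstract states that mixing happens during training, a model using such a function would presumably skip it at inference time.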