Distribution shift (DS) occurs when the data distribution at test time differs from the distribution at training time. Because the test-time distribution is not known in advance, DS can significantly impair the performance of a machine learning model in practical settings. To address this problem, we postulate that real-world distributions are composed of latent Invariant Elementary Distributions (I.E.D) across different domains. This assumption implies an invariant structure in the solution space that enables knowledge transfer to unseen domains. To exploit this property for domain generalization, we introduce a modular neural network layer consisting of Gated Domain Units (GDUs), each of which learns a representation of one latent elementary distribution. During inference, a weighted ensemble of learning machines is formed by comparing each new observation with the representation of every elementary distribution. Our framework also accommodates scenarios in which explicit domain information is not available. Extensive experiments on image, text, and graph data show consistent performance improvements on out-of-training target domains. These findings support the practicality of the I.E.D assumption and the effectiveness of GDUs for domain generalization.
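The inference step described above, in which a new observation is compared with each elementary-distribution representation and the resulting similarities weight an ensemble, can be illustrated with a minimal sketch. The abstract does not specify the similarity measure or the normalization, so this sketch assumes a mean RBF kernel similarity against a small set of basis vectors per unit and a simple sum-to-one normalization. The names `rbf_similarity`, `gated_ensemble_predict`, `bases`, and `learners` are hypothetical and are not taken from the authors' implementation.

```python
import math


def rbf_similarity(x, basis, sigma=1.0):
    """Mean RBF kernel value between observation x and a unit's basis vectors.

    Assumption: each elementary distribution is represented by a list of
    basis vectors; the abstract does not fix this choice.
    """
    total = 0.0
    for v in basis:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, v))
        total += math.exp(-sq_dist / (2.0 * sigma ** 2))
    return total / len(basis)


def gated_ensemble_predict(x, bases, learners, sigma=1.0):
    """Weight each learner's prediction by the similarity of x to its unit."""
    sims = [rbf_similarity(x, basis, sigma) for basis in bases]
    norm = sum(sims) + 1e-12  # small constant guards against division by zero
    weights = [s / norm for s in sims]
    prediction = sum(w * f(x) for w, f in zip(weights, learners))
    return prediction, weights


# Toy setup: two units with hand-picked basis vectors and constant learners.
bases = [
    [[-2.0, -2.0], [-1.5, -2.5]],  # hypothetical elementary distribution 0
    [[2.0, 2.0], [2.5, 1.5]],      # hypothetical elementary distribution 1
]
learners = [lambda x: -1.0, lambda x: 1.0]

prediction, weights = gated_ensemble_predict([2.0, 2.0], bases, learners)
```

In this toy setup the observation lies close to the second unit's basis vectors, so that unit receives nearly all of the weight and its learner dominates the ensemble output. In a trained model the basis vectors and learners would be learned jointly rather than fixed by hand.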