Factorizable joint shift (FJS) was proposed as a type of distribution shift (or dataset shift) that comprises both covariate and label shift. Recently, it has been observed that FJS arises from a label shift followed by a covariate shift, or vice versa. Research into FJS has so far been confined to categorical label spaces. We propose a framework for analysing distribution shift in general label spaces, thus covering both classification and regression models. Based on this framework, we generalise existing results on FJS to general label spaces and propose a related extension of the expectation maximisation (EM) algorithm for class prior probabilities. We also take a fresh look at generalized label shift (GLS) in the case of general label spaces.