Sparse joint shift (SJS) was recently proposed as a tractable model for general dataset shift, which may cause changes to the marginal distributions of features and labels as well as to the posterior probabilities and the class-conditional feature distributions. Fitting SJS for a target dataset without label observations may produce valid predictions of labels and estimates of class prior probabilities. We present new results on the transmission of SJS from sets of features to larger sets of features, a conditional correction formula for the class posterior probabilities under the target distribution, identifiability of SJS, and the relationship between SJS and covariate shift. We also point out inconsistencies in the algorithms proposed for estimating the characteristics of SJS; these inconsistencies could hamper the search for optimal solutions.