Sparse joint shift (SJS) was recently proposed as a tractable model for general dataset shift, under which the marginal distributions of features and labels, the posterior probabilities, and the class-conditional feature distributions may all change. Fitting SJS to a target dataset without label observations may produce valid label predictions and estimates of the class prior probabilities. We present new results on the transmission of SJS from sets of features to larger sets of features, a conditional correction formula for the class posterior probabilities under the target distribution, the identifiability of SJS, and the relationship between SJS and covariate shift. In addition, we point out inconsistencies in the algorithms that were proposed for estimating the characteristics of SJS, since these could hamper the search for optimal solutions, and we suggest potential improvements.