Sparse joint shift (SJS) was recently proposed as a tractable model for general dataset shift, which may change the marginal distributions of features and labels as well as the posterior probabilities and the class-conditional feature distributions. Fitting SJS to a target dataset without label observations may produce valid predictions of labels and estimates of class prior probabilities. We present new results on the transmission of SJS from sets of features to larger sets of features, a conditional correction formula for the class posterior probabilities under the target distribution, identifiability of SJS, and the relationship between SJS and covariate shift. In addition, we point out inconsistencies in the algorithms that were proposed for estimating the characteristics of SJS, as these inconsistencies could hamper the search for optimal solutions.