Ratings are widely used to evaluate and compare subjects in applications ranging from education to healthcare, because they provide succinct yet credible measures for comparison. However, when multiple rating lists are combined or considered together, many subjects have missing ratings, since most rating lists do not rate every subject in the combined list. In this study, we analyze missing-value patterns in six real-world data sets from various applications and identify the conditions under which imputation algorithms are applicable. Based on the special structures and properties revealed by these analyses, we propose optimization models and algorithms that impute missing ratings in the combined rating lists by minimizing the total rating discordance across rating providers, using only the known rating information. The total rating discordance is defined as the sum of pairwise discordance metrics, which can be written as a quadratic function. Computational experiments on real-world and synthetic rating data sets show that the proposed methods achieve higher imputation accuracy than state-of-the-art general imputation methods in the literature.
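To make the idea of minimizing a quadratic total discordance concrete, here is a minimal sketch. It is not the paper's model: the abstract does not specify the pairwise metric, so the sketch assumes that the discordance between two providers is the sum of squared rating differences over all subjects. The names `total_discordance` and `impute` are hypothetical. Under this assumption the objective separates by subject, and the minimizing value for every missing entry of a subject is the mean of that subject's known ratings.

```python
from itertools import combinations


def total_discordance(ratings):
    """Sum, over all provider pairs, of squared rating differences.

    `ratings` is a list of rows, one per provider, each holding one
    rating per subject. Assumed metric, not taken from the paper.
    """
    total = 0.0
    for a, b in combinations(range(len(ratings)), 2):
        total += sum((x - y) ** 2 for x, y in zip(ratings[a], ratings[b]))
    return total


def impute(ratings):
    """Fill None entries with the values that minimize total_discordance.

    For the assumed squared-difference metric, setting the gradient to
    zero gives the mean of the subject's known ratings for every
    missing entry of that subject.
    """
    filled = [list(row) for row in ratings]
    for s in range(len(ratings[0])):
        known = [row[s] for row in ratings if row[s] is not None]
        if not known:
            # Applicability condition: a subject nobody rated cannot be
            # imputed from known rating information alone.
            raise ValueError(f"subject {s} has no known rating")
        mean = sum(known) / len(known)
        for row in filled:
            if row[s] is None:
                row[s] = mean
    return filled


# Three providers (rows) rating three subjects (columns); None = missing.
ratings = [
    [4.0, None, 2.0],
    [5.0, 3.0, None],
    [None, 1.0, 2.0],
]
completed = impute(ratings)
```

With a different pairwise metric (for example, one defined on normalized ratings or on rank positions), the closed form above would no longer apply and a quadratic programming solver would be needed instead.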