The performance of a Collaborative Filtering (CF) method depends on the properties of the User-Item Rating Matrix (URM), and these properties, known as Rating Data Characteristics (RDC), change constantly. Recent studies used six or more RDC to explain a significant share of the variation in CF performance that results from changes in the URM. Here, we find that a significant proportion of the variation in the performance of different CF techniques can be attributed to only two RDC: the number of ratings per user, or Information per User (IpU), and the number of ratings per item, or Information per Item (IpI). Moreover, for a square URM, the performance of CF algorithms is a quadratic function of IpU (or, equivalently, IpI). These findings are based on seven well-established CF methods and three popular public recommender datasets: 1M MovieLens, 25M MovieLens, and Yahoo! Music Rating.
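To make the two characteristics concrete, the following minimal sketch computes IpU and IpI from a list of rating triples. It assumes that "ratings per user" and "ratings per item" mean the average number of ratings over the users and items that appear in the data; the abstract does not spell out the exact estimator, and the function name `information_per_user_and_item` and the toy data are hypothetical.

```python
from typing import Hashable, Iterable, Tuple

Rating = Tuple[Hashable, Hashable, float]  # (user_id, item_id, rating value)


def information_per_user_and_item(ratings: Iterable[Rating]) -> Tuple[float, float]:
    """Return (IpU, IpI) for a collection of rating triples.

    Assumption (not stated in the abstract): IpU and IpI are averages,
    i.e. the number of rated cells divided by the number of distinct users
    and by the number of distinct items, respectively.
    """
    users, items, cells = set(), set(), set()
    for user, item, _value in ratings:
        users.add(user)
        items.add(item)
        cells.add((user, item))  # a repeated (user, item) pair counts once

    if not cells:
        raise ValueError("at least one rating is required")

    n_ratings = len(cells)
    return n_ratings / len(users), n_ratings / len(items)


# Toy square URM: 3 users, 3 items, 6 rated cells.
toy_ratings = [
    ("u1", "i1", 5.0), ("u1", "i2", 3.0),
    ("u2", "i2", 4.0), ("u2", "i3", 2.0),
    ("u3", "i1", 1.0), ("u3", "i3", 4.0),
]
ipu, ipi = information_per_user_and_item(toy_ratings)
print(ipu, ipi)
```

In this toy case the URM is square (as many users as items), so the two averages coincide, which is why the abstract can state the quadratic relationship in terms of either IpU or IpI.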