Matrix factorization (MF) is a simple collaborative filtering technique that achieves strong recommendation accuracy by decomposing the user-item interaction matrix into user and item latent matrices. Because the model typically learns each interaction independently, it may overlook dependencies shared among users and items, which makes its recommendations less stable and less interpretable. To address this, we propose Hierarchical Matrix Factorization (HMF), which incorporates clustering to capture a hierarchy in which leaf nodes correspond to users or items and all other nodes correspond to clusters. Central to our approach is the hierarchical embedding: each latent matrix (embedding) is further decomposed into probabilistic connection matrices, which link the levels of the hierarchy, and a root cluster latent matrix. The embeddings are differentiable, so interactions and clustering can be learned simultaneously with a single gradient descent method. Furthermore, the resulting cluster-specific interactions naturally summarize user-item interactions and provide interpretability. Experimental results on rating and ranking prediction show that HMF outperforms existing MF methods, in particular achieving a 1.37-point improvement in RMSE for sparse interactions. We also confirm that integrating clustering gives HMF the potential for faster convergence and reduced overfitting compared with MF, and a cluster-centered case study demonstrates its interpretability.
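The decomposition described in the abstract can be illustrated with a minimal forward-pass sketch. It is not the authors' implementation: the row-softmax parameterization of the connection matrices, the number of hierarchy levels, the cluster counts, and all names (`row_softmax`, `hierarchical_embedding`, `user_logits`, and so on) are assumptions made for illustration. Only the overall structure, an embedding formed as a product of probabilistic connection matrices and a root cluster latent matrix, comes from the text above.

```python
import numpy as np


def row_softmax(logits):
    """Turn each row of logits into a probability distribution (assumed parameterization)."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def hierarchical_embedding(connection_logits, root_latent):
    """Compose embeddings from the root clusters down to the leaves.

    connection_logits is ordered from the leaf level to the root level, so the
    result is softmax(L_1) @ softmax(L_2) @ ... @ root_latent.
    """
    emb = root_latent
    for logits in reversed(connection_logits):
        emb = row_softmax(logits) @ emb
    return emb


# Hypothetical sizes: 6 users, 5 items, 3 latent dimensions.
rng = np.random.default_rng(0)
n_users, n_items, dim = 6, 5, 3

# Users: leaves -> 4 clusters -> 2 root clusters. Items: leaves -> 2 root clusters.
user_logits = [rng.normal(size=(n_users, 4)), rng.normal(size=(4, 2))]
item_logits = [rng.normal(size=(n_items, 2))]
user_root = rng.normal(size=(2, dim))
item_root = rng.normal(size=(2, dim))

user_emb = hierarchical_embedding(user_logits, user_root)  # shape (n_users, dim)
item_emb = hierarchical_embedding(item_logits, item_root)  # shape (n_items, dim)

pred = user_emb @ item_emb.T          # user-item interaction scores
cluster_pred = user_root @ item_root.T  # interactions between root clusters

print(pred.shape, cluster_pred.shape)
```

Under these assumptions every operation is a matrix product or a softmax, so the same expression could be trained end to end by gradient descent in an autodiff framework. Each user-item score is also a probability-weighted mixture of the root cluster scores in `cluster_pred`, which is one way to read the claim that cluster-specific interactions summarize user-item interactions.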