The embedding-based architecture has become the dominant approach in modern recommender systems: users and items are mapped into a compact vector space, and a predefined similarity metric, such as the inner product, scores each user-item embedding pair so that the recommended items align closely with the user's preferences. Given the critical role of similarity metrics in recommender systems, existing methods mainly rely on handcrafted metrics to capture the complex characteristics of user-item interactions. Yet handcrafted metrics may not fully capture the diverse similarity patterns, which can vary significantly across domains. To address this issue, we propose an Automated Similarity Metric Generation method for recommendations, named AutoSMG, which generates similarity metrics tailored to different domains and datasets. Specifically, we first construct a similarity metric space by sampling from a set of basic embedding operators, which are then composed into computational graphs that represent metrics. We then use an evolutionary algorithm to iteratively search this space for the best-performing metrics. To improve search efficiency, we use an early stopping strategy and a surrogate model to approximate the performance of candidate metrics instead of fully training a model for each one. Notably, the proposed method is model-agnostic and can be plugged seamlessly into different recommendation model architectures. We validate the method on three public recommendation datasets from different domains on the Top-K recommendation task, and the experimental results demonstrate that AutoSMG outperforms both commonly used handcrafted metrics and metrics generated by other search strategies.
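To make the search procedure concrete, the following is a minimal sketch of the general idea, not the AutoSMG implementation. It represents a metric as a small expression tree over elementwise operators and runs a simple mutation-only evolutionary loop. Everything named here is an assumption made for illustration: the operator set (`UNARY`, `BINARY`), the sum aggregation in `score`, the pairwise-ranking `fitness`, and the synthetic `toy_data`. The abstract does not list the actual operators, and in the described method a candidate's performance is estimated with early stopping and a surrogate model, which this sketch omits.

```python
import random

# Hypothetical operator set; the abstract does not specify the real one.
UNARY = {
    "abs": lambda v: [abs(x) for x in v],
    "neg": lambda v: [-x for x in v],
}
BINARY = {
    "add": lambda a, b: [x + y for x, y in zip(a, b)],
    "sub": lambda a, b: [x - y for x, y in zip(a, b)],
    "mul": lambda a, b: [x * y for x, y in zip(a, b)],
    "min": lambda a, b: [min(x, y) for x, y in zip(a, b)],
    "max": lambda a, b: [max(x, y) for x, y in zip(a, b)],
}

def random_metric(rng, depth=2):
    """Sample an expression tree whose leaves are 'u' (user) and 'i' (item)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(["u", "i"])
    if rng.random() < 0.3:
        return (rng.choice(sorted(UNARY)), random_metric(rng, depth - 1))
    return (rng.choice(sorted(BINARY)),
            random_metric(rng, depth - 1),
            random_metric(rng, depth - 1))

def evaluate(tree, u, i):
    """Evaluate the tree elementwise and return a vector."""
    if tree == "u":
        return u
    if tree == "i":
        return i
    op, *args = tree
    vals = [evaluate(a, u, i) for a in args]
    return UNARY[op](vals[0]) if len(vals) == 1 else BINARY[op](vals[0], vals[1])

def score(tree, u, i):
    """Aggregate the output vector into a scalar similarity (sum is an assumption)."""
    return sum(evaluate(tree, u, i))

def mutate(tree, rng):
    """Replace a randomly chosen subtree with a freshly sampled one."""
    if isinstance(tree, str) or rng.random() < 0.3:
        return random_metric(rng)
    op, *args = tree
    k = rng.randrange(len(args))
    args[k] = mutate(args[k], rng)
    return (op, *args)

def fitness(tree, data):
    """Fraction of (positive, negative) pairs that the metric ranks correctly."""
    pos = [score(tree, u, i) for u, i, y in data if y == 1]
    neg = [score(tree, u, i) for u, i, y in data if y == 0]
    wins = sum(p > n for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def toy_data(rng, n=40, dim=4):
    """Synthetic pairs: the closer half (squared distance) is labeled positive."""
    rows = []
    for _ in range(n):
        u = [rng.uniform(-1, 1) for _ in range(dim)]
        i = [rng.uniform(-1, 1) for _ in range(dim)]
        rows.append((u, i, sum((a - b) ** 2 for a, b in zip(u, i))))
    median = sorted(r[2] for r in rows)[n // 2]
    return [(u, i, 1 if d < median else 0) for u, i, d in rows]

def evolve(data, rng, pop_size=20, generations=10):
    """Keep the top half each generation and refill with mutated copies."""
    inner_product = ("mul", "u", "i")  # handcrafted baseline seeds the population
    population = [inner_product] + [random_metric(rng) for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=lambda t: fitness(t, data), reverse=True)
        parents = population[: pop_size // 2]
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=lambda t: fitness(t, data))

if __name__ == "__main__":
    rng = random.Random(0)
    data = toy_data(rng)
    best = evolve(data, rng)
    print("best metric:", best, "fitness:", fitness(best, data))
```

Because the population is seeded with the inner product and the top half always survives, the metric returned by `evolve` never scores below the inner-product baseline on the toy fitness. In a real setting, `fitness` would be replaced by a Top-K evaluation of a recommendation model trained with the candidate metric, which is exactly the expensive step that early stopping and a surrogate model are meant to approximate.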