Effective recommendation systems rely on capturing user preferences, which often requires incorporating many features, such as the universally unique identifiers (UUIDs) of entities. However, the exceptionally high cardinality of UUIDs poses a significant challenge: the resulting sparsity degrades model quality and inflates model size. This paper presents two techniques to address high cardinality in recommendation systems. Specifically, we propose a bag-of-words approach, combined with layer sharing, to substantially decrease model size while improving performance. We evaluated our techniques through offline and online experiments on Uber use cases, with promising results that demonstrate the effectiveness of our approach in optimizing recommendation systems.
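To make the size argument concrete, the following is a minimal sketch of one possible reading of a bag-of-words treatment of UUIDs with a shared embedding layer. The abstract does not specify the tokenization or the sharing scheme, so everything here is an assumption for illustration: the two-hex-character chunking, the mean pooling, the names `uuid_to_tokens` and `SharedBagEmbedding`, the embedding width, and the example UUIDs are all hypothetical and not taken from the paper.

```python
import random

EMBED_DIM = 8   # assumed embedding width, for illustration only
CHUNK = 2       # assumed token size: 2 hex chars -> 16**2 = 256 possible tokens


def uuid_to_tokens(uuid_str, chunk=CHUNK):
    """Split a UUID's hex digits into fixed-size chunks (the 'words')."""
    hex_digits = uuid_str.replace("-", "").lower()
    return [hex_digits[i:i + chunk] for i in range(0, len(hex_digits), chunk)]


class SharedBagEmbedding:
    """One small token table reused by every UUID feature (assumed sharing scheme)."""

    def __init__(self, dim=EMBED_DIM, chunk=CHUNK, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        self.chunk = chunk
        self.vocab_size = 16 ** chunk
        # Randomly initialised rows stand in for trainable weights.
        self.table = [
            [rng.uniform(-0.1, 0.1) for _ in range(dim)]
            for _ in range(self.vocab_size)
        ]

    def embed(self, uuid_str):
        """Mean-pool the token vectors; token order is ignored (bag of words)."""
        tokens = uuid_to_tokens(uuid_str, self.chunk)
        rows = [self.table[int(tok, 16)] for tok in tokens]
        return [sum(col) / len(rows) for col in zip(*rows)]

    def num_parameters(self):
        return self.vocab_size * self.dim


shared = SharedBagEmbedding()

# The same table serves two different UUID-valued features.
user_vec = shared.embed("123e4567-e89b-12d3-a456-426614174000")
store_vec = shared.embed("9b2f0c1a-7d4e-4f6a-8b3c-5e1d2a0f9c77")

# A conventional table with one row per distinct UUID, for comparison,
# assuming a hypothetical one million distinct IDs.
per_id_parameters = 1_000_000 * EMBED_DIM
shared_parameters = shared.num_parameters()
```

Under these assumptions the shared table's size depends only on the token vocabulary, not on the number of distinct UUIDs. The trade-off is that pooling discards token order, so two UUIDs containing the same chunks in a different order would map to the same vector.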