Behavioral patterns captured in embeddings learned from interaction data are pivotal across various stages of production recommender systems. However, in the initial retrieval stage, practitioners face an inherent tradeoff between embedding expressiveness and the scalability and latency of serving components, resulting in the need for representations that are both compact and expressive. To address this challenge, we propose a training strategy for learning high-dimensional sparse embedding layers in place of conventional dense ones, balancing efficiency, representational expressiveness, and interpretability. To demonstrate our approach, we modified the production-grade collaborative filtering autoencoder ELSA, achieving up to a 10x reduction in embedding size with no loss of recommendation accuracy, and up to a 100x reduction with only a 2.5% loss. Moreover, the active embedding dimensions reveal an interpretable inverted-index structure that segments items in a way directly aligned with the model's latent space, thereby enabling integration of segment-level recommendation functionality (e.g., 2D homepage layouts) within the candidate retrieval model itself. Source code, additional results, and a live demo are available at https://github.com/zombak79/compressed_elsa.
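To make the idea of a sparse high-dimensional embedding layer concrete, the following is a minimal illustrative sketch, not the paper's actual training procedure: it sparsifies an ELSA-style item-embedding matrix with proximal gradient descent, where an L1 soft-thresholding step after each gradient update drives weights to exactly zero. All sizes, the learning rate, and the threshold are hypothetical toy values.

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, dim = 200, 512            # toy sizes; a wide layer meant to end up mostly sparse
A = rng.normal(scale=0.1, size=(n_items, dim))   # item-embedding matrix

def l1_prox(W, thresh):
    """Soft-thresholding: proximal operator of the L1 penalty, zeroing small entries."""
    return np.sign(W) * np.maximum(np.abs(W) - thresh, 0.0)

# Toy implicit-feedback batch: 64 users, ~5% of items interacted with.
X = (rng.random((64, n_items)) < 0.05).astype(float)

lr, thresh = 1e-4, 0.005
for _ in range(20):
    E = X @ A @ A.T - X                      # ELSA-style reconstruction error
    grad = 2 * (X.T @ E @ A + E.T @ X @ A)   # gradient of ||X A A^T - X||_F^2 w.r.t. A
    A = l1_prox(A - lr * grad, thresh)       # gradient step, then L1 proximal step

# Fraction of weights that are exactly zero -- the dimensions that remain active
# per item form the inverted-index structure described in the abstract.
sparsity = float((A == 0).mean())
print(f"fraction of exactly-zero weights: {sparsity:.2f}")
```

The key design point is that soft-thresholding produces exact zeros rather than merely small values, so the surviving nonzero dimensions can be served from a sparse index at retrieval time.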