Item-to-Item (I2I) retrieval is a fundamental component of modern content platforms, supporting critical industrial workflows from recommendation engines to content auditing. While multimodal embedding methods have advanced general retrieval, they often falter in I2I scenarios for three reasons: the difficulty of balancing global content representation with fine-grained local retrieval, the systemic inefficiency of decoupled embedding-and-ranking pipelines, and the inherent trade-off between model precision and serving latency. To address these issues, we propose \textbf{UniNote}, a unified embedding model designed for industrial I2I retrieval. We introduce tailored retrieval strategies to support representation learning over complex, multimodal content at varying granularities. To operationalize these strategies, UniNote employs a two-stage training paradigm: the first stage uses contrastive supervised fine-tuning (SFT) to establish robust base embeddings, while the second stage refines ranking quality through a reinforcement learning (RL) process that aligns the model with content relevance. Our results show that UniNote achieves state-of-the-art (SOTA) performance across diverse I2I tasks. Deployed at Xiaohongshu and integrated with Matryoshka Representation Learning (MRL), UniNote delivers significant improvements in retrieval quality and cost efficiency in large-scale applications.
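To make the two generic ingredients named above concrete, the following is a minimal sketch of an in-batch contrastive (InfoNCE-style) loss averaged over nested embedding prefixes, which is the usual way Matryoshka Representation Learning is combined with contrastive training. It is not UniNote's implementation: the function names, the prefix sizes in `dims`, and the `temperature` value are all illustrative assumptions, and NumPy stands in for a real training framework.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    # Scale each vector to unit length so dot products equal cosine similarity.
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)

def info_nce(queries, positives, temperature=0.05):
    # In-batch contrastive loss: row i of `positives` is the match for
    # row i of `queries`; every other row in the batch acts as a negative.
    q = l2_normalize(queries)
    p = l2_normalize(positives)
    logits = q @ p.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

def matryoshka_loss(queries, positives, dims=(64, 128, 256), temperature=0.05):
    # Average the contrastive loss over nested prefixes of the embedding,
    # so that each prefix is trained to be usable on its own.
    # The prefix sizes are hypothetical, chosen only for illustration.
    losses = [info_nce(queries[:, :d], positives[:, :d], temperature) for d in dims]
    return float(np.mean(losses))

def truncate(embeddings, dim):
    # Serving-time shortcut: keep the first `dim` coordinates and re-normalize,
    # trading some precision for smaller indexes and faster similarity search.
    return l2_normalize(embeddings[:, :dim])
```

Under these assumptions, a serving system could index `truncate(embeddings, 64)` for a cheap first pass and keep the full-width vectors for re-scoring, which is one way the precision versus latency trade-off mentioned above is commonly handled.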