This paper presents a case study on deploying Large Language Models (LLMs) as an "annotation" mechanism for nuanced content understanding (e.g., discerning a video's "vibe") in a large-scale industrial short-form video recommendation system. Traditional machine learning classifiers for content understanding suffer from protracted development cycles and lack deep, nuanced comprehension. The "LLM-as-annotators" approach addresses both limitations: it significantly shortens development time and makes it possible to annotate subtle attributes. This work details an end-to-end workflow comprising: (1) iterative definition and robust evaluation of target attributes, refined using offline metrics and online A/B testing; (2) scalable offline bulk annotation of video corpora using LLMs with multimodal features, optimized inference, and knowledge distillation for broad application; and (3) integration of these rich annotations into the online recommendation serving system, for example through personalized restrict retrieval. Experimental results demonstrate the efficacy of this approach: LLMs outperform human raters in offline annotation quality for nuanced attributes, and online A/B tests show significant improvements in user participation and satisfied consumption. The study provides insights into designing and scaling production-level LLM pipelines for rich content evaluation, and highlights how LLM-generated nuanced understanding can improve content discovery, user satisfaction, and the overall effectiveness of modern recommendation systems.
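As an illustration of step (3), the sketch below shows one plausible reading of personalized restrict retrieval: candidates are restricted to videos whose offline LLM annotations overlap a user's preferred attributes, and the survivors are ranked by an upstream relevance score. The abstract does not specify the mechanism, so the `Video` type, the `vibes` labels, the `relevance` field, and the `restrict_retrieve` function are all hypothetical names introduced here, not the system's actual interface.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Video:
    """Hypothetical candidate record; field names are assumptions."""
    video_id: str
    relevance: float        # score from an upstream retrieval model (assumed)
    vibes: frozenset[str]   # labels written offline by the LLM annotator (assumed)


def restrict_retrieve(
    candidates: list[Video], preferred_vibes: set[str], k: int
) -> list[Video]:
    """Keep candidates sharing at least one annotation with the user's
    preferred vibes, then return the top-k by relevance."""
    allowed = [v for v in candidates if v.vibes & preferred_vibes]
    return sorted(allowed, key=lambda v: v.relevance, reverse=True)[:k]


# Toy corpus with made-up annotations, for illustration only.
corpus = [
    Video("a", 0.90, frozenset({"cozy"})),
    Video("b", 0.80, frozenset({"intense"})),
    Video("c", 0.70, frozenset({"cozy", "funny"})),
    Video("d", 0.95, frozenset()),  # unannotated video is never retrieved
]
top = restrict_retrieve(corpus, {"cozy"}, k=2)
print([v.video_id for v in top])
```

In this sketch the annotations act as a hard filter applied before ranking; a production system might equally use them as soft ranking features, and the abstract does not say which design was chosen.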