As the use of online platforms continues to grow across all demographics, users often express a desire to feel represented in the content they see. To improve representation in search results and recommendations, we introduce end-to-end diversification, ensuring that diverse content flows through every stage of these systems, from retrieval to ranking. We develop, experiment with, and deploy scalable diversification mechanisms on multiple production surfaces of the Pinterest platform, including Search, Related Products, and New User Homefeed, to improve the representation of different skin tones in beauty and fashion content. Diversification in production systems includes three components: identifying requests that will trigger diversification, ensuring diverse content is retrieved from the large content corpus during the retrieval stage, and finally, balancing the diversity-utility trade-off in a self-adjusting manner in the ranking stage. Our approaches, which evolved from using a Strong-OR logical operator to bucketized retrieval at the retrieval stage and from greedy re-rankers to multi-objective optimization using determinantal point processes at the ranking stage, balance diversity and utility while enabling fast iteration and scalable expansion to diversification over multiple dimensions. Our experiments indicate that these approaches significantly improve diversity metrics, with a neutral-to-positive impact on utility metrics and improved user satisfaction, both qualitatively and quantitatively, in production.
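To make the diversity-utility trade-off at the ranking stage concrete, here is a minimal sketch of a greedy re-ranker, the simpler family of methods the abstract mentions as a starting point. It is not the production implementation: the `Item` fields, the bucket labels, the `target` distribution, and the `weight` knob are all assumptions introduced for illustration.

```python
from collections import Counter
from typing import NamedTuple


class Item(NamedTuple):
    item_id: str
    utility: float  # relevance score from an upstream ranker (assumed scale)
    bucket: str     # hypothetical skin-tone bucket label


def greedy_diverse_rerank(items, target, k, weight=0.5):
    """Pick k items one at a time, trading utility against bucket deficit.

    target maps bucket -> desired share of the top-k (shares sum to 1).
    weight is an assumed knob: 0 gives pure utility order, larger values
    push harder toward the target distribution.
    """
    remaining = list(items)
    chosen = []
    counts = Counter()
    while remaining and len(chosen) < k:
        position = len(chosen) + 1

        def score(item):
            # Deficit: how far this bucket falls below its target count
            # if the list were cut at the current position.
            deficit = target.get(item.bucket, 0.0) * position - counts[item.bucket]
            return item.utility + weight * deficit

        best = max(remaining, key=score)
        remaining.remove(best)
        chosen.append(best)
        counts[best.bucket] += 1
    return chosen
```

With `weight=0` the function returns the items in plain utility order; raising `weight` lets under-represented buckets displace slightly more relevant items from over-represented ones. A determinantal point process replaces this hand-tuned deficit term with a kernel that scores whole sets jointly, which is beyond the scope of this sketch.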