Recommender systems have become an integral part of online platforms, providing personalized suggestions for purchasing items, consuming content, and connecting with other people. An online recommender system has two sides: the producer side, which comprises product sellers, content creators, service providers, and the like, and the consumer side, which includes buyers, viewers, guests, and the like. To optimize an online recommender system, A/B tests serve as the gold standard for comparing ranking models and evaluating their impact on both consumers and producers. Consumer-side experiments are relatively straightforward to design and are commonly used to gauge the impact of ranking changes on consumer behavior. Designing producer-side experiments, by contrast, presents a considerable challenge, because producer items in the treatment and control groups need to be ranked by different models and then merged into a single ranking for the recommender to show to each consumer. In this paper, we review issues with the existing methods, propose new design principles for producer-side experiments, and develop a rigorous solution based on counterfactual interleaving designs for accurately measuring the effects of ranking changes on the producers (sellers, creators, etc.).