For content recommender systems such as TikTok and YouTube, the platform's decision algorithm shapes the incentives of content producers, including how much effort they invest in the quality of their content. Many platforms employ online learning, which creates intertemporal incentives, since content produced today affects recommendations of future content. In this paper, we study the incentives arising from online learning, analyzing the quality of content produced at a Nash equilibrium. We show that classical online learning algorithms, such as Hedge and EXP3, unfortunately incentivize producers to create low-quality content. In particular, the quality of content is upper bounded in terms of the learning rate and approaches zero for typical learning rate schedules. Motivated by this negative result, we design a different learning algorithm, based on punishing producers who create low-quality content, that correctly incentivizes producers to create high-quality content. At a conceptual level, our work illustrates the unintended impact that a platform's learning algorithm can have on content quality and opens the door towards designing platform learning algorithms that incentivize the creation of high-quality content.
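To make the role of the learning rate concrete, the following is a minimal sketch of the textbook Hedge (multiplicative weights) update, in which each producer is recommended with probability proportional to the exponential of the learning rate times that producer's cumulative reward. The abstract does not specify the model, so the function names (`hedge_probabilities`, `run_hedge`), the two-producer setup, and the reward values are illustrative assumptions rather than the paper's construction.

```python
import math


def hedge_probabilities(cumulative_rewards, eta):
    """Textbook Hedge: P(i) is proportional to exp(eta * cumulative reward of i)."""
    # Subtract the maximum before exponentiating for numerical stability;
    # this does not change the normalized probabilities.
    top = max(cumulative_rewards)
    weights = [math.exp(eta * (r - top)) for r in cumulative_rewards]
    total = sum(weights)
    return [w / total for w in weights]


def run_hedge(reward_rounds, eta):
    """Return the recommendation distribution used in each round."""
    n = len(reward_rounds[0])
    cumulative = [0.0] * n
    history = []
    for rewards in reward_rounds:
        # The distribution for this round depends only on past rewards.
        history.append(hedge_probabilities(cumulative, eta))
        cumulative = [c + r for c, r in zip(cumulative, rewards)]
    return history


# Hypothetical data: producer 0 posts higher-quality content in every round.
rounds = [[1.0, 0.2]] * 50
small_eta = run_hedge(rounds, eta=0.01)
large_eta = run_hedge(rounds, eta=0.5)

print("final share of producer 0, eta=0.01:", small_eta[-1][0])
print("final share of producer 0, eta=0.5: ", large_eta[-1][0])
```

In this toy run, a small learning rate shifts recommendations toward the higher-quality producer only slowly, which gives some intuition for why the abstract's quality bound is stated in terms of the learning rate. The sketch does not model producers' strategic behavior and does not reproduce the paper's equilibrium analysis.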