In content recommender systems such as TikTok and YouTube, the platform's recommendation algorithm shapes the incentives of content producers. Many platforms employ online learning, which creates intertemporal incentives because content produced today affects how future content is recommended. We study the resulting game between producers and analyze the content created at equilibrium. We show that standard online learning algorithms, such as Hedge and EXP3, incentivize producers to create low-quality content: under typical learning rate schedules, producers' effort approaches zero in the long run. Motivated by this negative result, we design learning algorithms that incentivize producers to invest high effort and that achieve high user welfare. At a conceptual level, our work illustrates the unintended impact that a platform's learning algorithm can have on content quality and introduces algorithmic approaches to mitigating these effects.
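To make the intertemporal dependence concrete, the following is a minimal sketch of the textbook Hedge (exponential weights) rule, not the specific model or algorithms analyzed in this work. It assumes a full-information setting in which each producer's per-round reward is a fixed number, and it uses a learning rate of 1/sqrt(t) purely as an illustration of a decaying schedule. The names `hedge_weights`, `simulate`, and `qualities` are hypothetical.

```python
import math


def hedge_weights(cumulative_rewards, eta):
    """Return Hedge probabilities proportional to exp(eta * cumulative reward)."""
    # Subtracting the maximum keeps exp() numerically stable and does not
    # change the normalized probabilities.
    top = max(cumulative_rewards)
    scores = [math.exp(eta * (r - top)) for r in cumulative_rewards]
    total = sum(scores)
    return [s / total for s in scores]


def simulate(qualities, rounds):
    """Run Hedge for `rounds` steps and return the last recommendation distribution.

    Assumption (illustrative only): producer i yields reward qualities[i]
    in every round, and the learning rate at round t is 1 / sqrt(t).
    """
    cumulative = [0.0] * len(qualities)
    probs = hedge_weights(cumulative, 1.0)  # uniform before any feedback
    for t in range(1, rounds + 1):
        eta = 1.0 / math.sqrt(t)
        # Recommendations at round t depend only on rewards from earlier rounds.
        probs = hedge_weights(cumulative, eta)
        for i, q in enumerate(qualities):
            cumulative[i] += q
    return probs


if __name__ == "__main__":
    print(simulate([0.9, 0.5], rounds=100))
```

In this sketch, the probability that a producer is recommended at round t is a function of the rewards accumulated in earlier rounds, which is the sense in which content produced today affects future recommendations.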