Enhancing Long-Term Welfare in Recommender Systems: An Information Revelation Approach

Improving long-term user welfare (e.g., sustained user engagement) has become a central objective of recommender systems (RS). On real-world platforms, the creation behavior of content creators plays a crucial role in shaping long-term welfare beyond short-term recommendation accuracy, so steering creator behavior effectively is essential to fostering a healthier RS ecosystem. Existing work typically relies on re-ranking algorithms that heuristically adjust item exposure to steer creators' behavior. However, when embedded within recommendation pipelines, such a strategy often conflicts with the short-term objective of improving recommendation accuracy, leading to performance degradation and suboptimal long-term welfare. Well-established studies in economics suggest an alternative that does not rely on the design of the recommendation algorithm: revealing information from an information-rich party (the sender) to a less-informed party (the receiver) can effectively change the receiver's beliefs and steer their behavior. Inspired by this idea, we propose an information-revealing framework named Long-term Welfare Optimization via Information Revelation (LoRe). In this framework, we use a classical information revelation method (i.e., Bayesian persuasion) to model the stakeholders in RS, treating the platform as the sender and creators as the receivers. To address the challenge posed by the unrealistic assumptions of traditional economic methods, we formulate the process of information revelation as a Markov Decision Process (MDP) and propose a learning algorithm that is trained and performs inference in environments with boundedly rational creators. Extensive experiments on two real-world RS datasets demonstrate that our method outperforms existing fair re-ranking methods and information-revealing strategies in improving long-term user welfare.
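To make the information-revelation idea concrete, the following is a minimal sketch of the textbook one-shot, binary-state Bayesian persuasion problem. It is not the LoRe algorithm, which is formulated as an MDP and handles boundedly rational creators. The scenario and all names here (`prior`, `threshold`, the "go" signal) are illustrative assumptions: a platform (sender) knows whether demand for a topic is high or low, and a fully rational creator (receiver) produces content only if their posterior belief in high demand reaches a threshold.

```python
from fractions import Fraction


def posterior_high(prior, p_go_given_high, p_go_given_low):
    """Bayes' rule: P(state = high | signal = 'go')."""
    num = prior * p_go_given_high
    den = num + (1 - prior) * p_go_given_low
    return num / den


def optimal_go_rate_in_low_state(prior, threshold):
    """P('go' | low state) that maximizes how often the receiver acts.

    Assumes the sender always sends 'go' in the high state and that the
    receiver acts whenever the posterior is at least `threshold`.
    """
    if prior >= threshold:
        # The receiver already acts on the prior, so an uninformative
        # signal (always 'go') is enough.
        return Fraction(1)
    # Largest rate that keeps the posterior after 'go' exactly at the threshold.
    return prior * (1 - threshold) / ((1 - prior) * threshold)


# Hypothetical numbers chosen only for illustration.
prior = Fraction(3, 10)      # creator's prior belief that demand is high
threshold = Fraction(1, 2)   # creator acts if posterior >= 1/2

q = optimal_go_rate_in_low_state(prior, threshold)
post = posterior_high(prior, Fraction(1), q)
p_act = prior + (1 - prior) * q  # overall probability the creator acts

print(f"P(go | low) = {q}, posterior after go = {post}, P(act) = {p_act}")
```

With these assumed numbers, the creator would never act on the prior alone (3/10 is below 1/2) and would act with probability 3/10 under full disclosure, whereas the partially informative signal raises that probability to 3/5.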
