Online marketplaces use rating systems to promote the discovery of high-quality products. However, these systems also lead to high variance in producers' economic outcomes: a new producer who sells high-quality items may, by bad luck, receive one low rating early on, which reduces their popularity with future customers. We investigate the design of rating systems that balance two goals: identifying high-quality products ("efficiency") and minimizing the variance in economic outcomes among producers of similar quality (individual "producer fairness"). We observe a trade-off between these goals: rating systems that promote efficiency are necessarily less individually fair to producers. We introduce Bayesian rating systems as an approach to managing this trade-off. Informally, the systems we propose set a system-wide prior for the quality of an incoming product and then update that prior to a Bayesian posterior on quality as user-generated ratings arrive over time. Through calibrated simulations, we show that the strength of the prior directly determines the operating point on this trade-off: the stronger the prior, the more the marketplace discounts early ratings data (so individual producer fairness increases), but the slower the platform learns about true item quality (so efficiency suffers). Importantly, the prevailing method of ratings aggregation, displaying the sample mean of ratings, is an extreme point in this design space that maximally prioritizes efficiency at the expense of producer fairness. Instead, by choosing a Bayesian rating system design with an appropriately set prior, a platform can be intentional about the consequential choice of balance between efficiency and producer fairness.
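To make the role of the prior concrete, the following is a minimal sketch and not the paper's actual model. It assumes binary ratings (1 = positive, 0 = negative) and a Beta prior on quality. The function name `bayesian_score` and the parameters `prior_mean` and `prior_strength` are hypothetical and chosen for illustration only. Under these assumptions, the displayed score is the posterior mean, and setting the prior strength to zero recovers the sample mean.

```python
def bayesian_score(ratings, prior_mean=0.5, prior_strength=0.0):
    """Posterior mean of quality under a Beta prior (illustrative assumption).

    ratings        -- list of binary ratings, 1 = positive, 0 = negative
    prior_mean     -- system-wide prior belief about quality, in [0, 1]
    prior_strength -- number of pseudo-ratings the prior is worth;
                      0 reduces the score to the plain sample mean
    """
    n = len(ratings)
    if n + prior_strength == 0:
        # No ratings and no prior weight: fall back to the prior mean.
        return prior_mean
    # Beta(a, b) prior with a = strength * mean, b = strength * (1 - mean);
    # the posterior mean is (a + positives) / (a + b + n).
    return (prior_strength * prior_mean + sum(ratings)) / (prior_strength + n)


# A high-quality newcomer who happens to receive one bad rating first.
unlucky_start = [0]

# Sample mean (zero prior strength): the single bad rating dominates.
sample_mean_score = bayesian_score(unlucky_start, prior_strength=0)

# Stronger prior: the early rating is discounted.
shrunk_score = bayesian_score(unlucky_start, prior_mean=0.8, prior_strength=10)

print(sample_mean_score)       # 0.0
print(round(shrunk_score, 3))  # 0.727
```

In this sketch, a larger `prior_strength` keeps the unlucky newcomer's score close to the system-wide prior, but it also means more ratings are needed before the score reflects the item's true quality, which mirrors the trade-off described in the abstract.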