Auditing Yelp's Business Ranking and Review Recommendation Through the Lens of Fairness

Web 2.0 recommendation systems, such as Yelp, connect users and businesses so that users can discover new businesses and share their experiences in the form of reviews. Yelp's recommendation software moderates user-provided content by sorting reviews into recommended and not-recommended sections. Given Yelp's substantial popularity and its strong influence on the success of local businesses, understanding the fairness of its algorithms is crucial. However, without access to the training data and the algorithms used by such black-box systems, studying their fairness is not trivial: it requires considerable effort to minimize bias in data collection and to account for confounding factors in the analysis. This large-scale, data-driven study is the first to investigate Yelp's business ranking and review recommendation system through the lens of fairness. We define and test four hypotheses to determine whether Yelp's recommendation software shows bias and whether Yelp's business ranking algorithm shows bias against restaurants located in specific neighborhoods. Our findings show that reviews of female and less-established users are disproportionately categorized as recommended. We also find a positive association between a restaurant being located in a hotspot region and its average exposure. Furthermore, we observe some cases of severe disparity bias in cities where the hotspots are in neighborhoods with less demographic diversity or in areas with higher affluence and education levels. Indeed, biases introduced by data-driven systems, including those reported in this paper, are (almost) always implicit and operate through proxy attributes. Still, we believe such implicit biases should be detected and resolved, because they can create cycles of discrimination that further widen the social gaps between different groups.
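To make the idea of reviews being "disproportionately categorized as recommended" concrete, the following is a minimal sketch of one standard way to compare recommendation rates between two user groups: a two-sided two-proportion z-test. It is not the method used in the study. The function name `two_proportion_z_test` and all counts are hypothetical and invented purely for illustration, and the sketch ignores the confounding factors that the abstract says must be accounted for.

```python
import math


def two_proportion_z_test(rec_a, total_a, rec_b, total_b):
    """Two-sided z-test for a difference in recommended-review rates.

    rec_*   : number of reviews placed in the recommended section
    total_* : total number of reviews written by that group
    Returns (rate difference, z statistic, two-sided p-value).
    """
    p_a = rec_a / total_a
    p_b = rec_b / total_b
    # Pooled rate under the null hypothesis that both groups share one rate.
    pooled = (rec_a + rec_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (p_a - p_b) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_a - p_b, z, p_value


# Hypothetical counts, NOT taken from the paper.
diff, z, p_value = two_proportion_z_test(
    rec_a=820, total_a=1000,  # group A: 82% recommended
    rec_b=780, total_b=1000,  # group B: 78% recommended
)
print(f"rate difference = {diff:.3f}, z = {z:.3f}, p = {p_value:.4f}")
```

A small p-value here would only indicate that the raw rates differ. Attributing the gap to a group attribute would additionally require controlling for confounders such as review length or user activity, for example with a regression or matched comparison.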