Online freelance marketplaces, a rapidly growing part of the global labor market, aim to create a fair environment in which professional skills are the main factor in hiring. While these platforms can reduce the bias found in traditional hiring, the personal information in user profiles raises concerns about ongoing discrimination. Past studies on this topic have mostly relied on existing observational data, which makes it hard to control for confounding factors and to isolate the effect of attributes such as gender or race. To address these problems, this paper presents a new method that uses Retrieval-Augmented Generation (RAG) with a Large Language Model (LLM) to create realistic, artificial freelancer profiles for controlled experiments. This approach separates individual factors, enabling a clearer statistical analysis of how different variables influence the freelance project process. We analyze data extracted from a live platform with traditional statistical methods to study the post-project stage, and we use a dataset with highly controlled variables, generated by a RAG-LLM, to conduct a simulated hiring experiment that studies the pre-project stage. Regarding gender, no significant preference emerged in initial hiring decisions, but female freelancers are substantially more likely to receive imperfect ratings in the post-project stage. Regarding regional bias, we find a strong and consistent preference for US-based freelancers: they are more likely to be selected in the simulated experiments, are perceived as more leader-like, and receive higher ratings on the live platform.