Artificial Intelligence (AI), particularly Machine Learning (ML) and Large Language Models (LLMs), is now widely applied across many contexts. However, the corresponding models often operate as black boxes and can unintentionally act unfairly towards different demographic groups. This has led to a growing focus on fairness in AI software, alongside the traditional focus on the effectiveness of AI models. Through 26 semi-structured interviews with practitioners from different application domains and with varied backgrounds across 23 countries, we investigated fairness requirements in AI from a software engineering perspective. Our study assesses the participants' awareness of fairness in AI/ML software and its application within the Software Development Life Cycle (SDLC), from translating fairness concerns into requirements to assessing them early in the SDLC. It also examines fairness through the key assessment dimensions of implementation, validation, and evaluation, and how it is balanced against other priorities, such as delivering all the software functionalities and meeting critical delivery deadlines. Our thematic qualitative analysis shows that while participants recognize these AI fairness dimensions, practices are inconsistent, and fairness is often deprioritized, with noticeable knowledge gaps. This highlights the need for agreement with relevant stakeholders on well-defined, contextually appropriate fairness definitions, the corresponding evaluation metrics, and formalized processes to better integrate fairness into AI/ML projects.