While generative AI systems have gained popularity across diverse applications, their potential to produce harmful outputs limits their trustworthiness and usability. Recent years have seen growing interest in engaging diverse AI users in auditing the generative AI systems that might impact their lives. To this end, we propose MIRAGE, a web-based tool that lets AI users compare outputs from multiple text-to-image (T2I) models by auditing AI-generated images and reporting their findings in a structured way. We used MIRAGE to conduct a preliminary user study with five participants and found that, compared to reviewing a single model's outputs, MIRAGE users could leverage their own lived experiences and identities to surface previously unnoticed details about harmful biases when reviewing multiple T2I models' outputs.