To investigate the widely observed racial disparities in computer vision systems that analyze images of humans, researchers have turned to skin tone as a more objective annotation than race metadata for fairness performance evaluations. However, current skin tone annotation procedures vary widely. For instance, researchers use a range of untested scales and skin tone categories, leave annotation procedures unclear, and provide inadequate analyses of uncertainty. In addition, little attention is paid to the positionality of the humans involved in the annotation process (designers and annotators alike) or to the historical and sociological context of skin tone in the United States. Our work is the first to investigate the skin tone annotation process as a sociotechnical project. We surveyed recent skin tone annotation procedures and conducted annotation experiments to examine how subjective understandings of skin tone are embedded in those procedures. Our systematic literature review revealed an uninterrogated association between skin tone and race, as well as limited effort to analyze annotator uncertainty in current skin tone annotation procedures for computer vision evaluation. Our experiments demonstrated that design decisions in the annotation procedure, such as the order in which the skin tone scale is presented or additional context in the image (i.e., presence of a face), significantly affected both the inter-annotator agreement and the individual uncertainty of the resulting skin tone annotations. We call for greater reflexivity in the design, analysis, and documentation of procedures for evaluation using skin tone.