The deployment of NLP systems has raised concerns about the harms they might produce, including representational harms. Recent literature has begun to conceptualize and measure one such harm, erasure. Nevertheless, the field lacks a clear and cohesive conceptual foundation for identifying and measuring it. Existing conceptualizations are often either broad, which makes it difficult to identify what is needed to establish and measure erasure, or specific to particular settings, which facilitates measurement in those settings but can make them challenging to adapt to others. To address this gap, we propose a structured definition of erasure that clarifies which components are necessary for establishing whether erasure has occurred. Practitioners need to explicitly articulate and operationalize these components in order to measure erasure.