AI and NLP publication venues have increasingly encouraged researchers to reflect on possible ethical considerations, adverse impacts, and other responsible AI issues their work might engender. However, for specific NLP tasks, our understanding of how prevalent such issues are, or when and why they are likely to arise, remains limited. Focusing on text summarization, a common NLP task largely overlooked by the responsible AI community, we examine research and reporting practices in the current literature. We conduct a multi-round qualitative analysis of 333 summarization papers from the ACL Anthology published between 2020 and 2022. We focus on which responsible AI issues are covered, how and when they are covered, which relevant stakeholders are considered, and mismatches between stated and realized research goals. We also discuss current evaluation practices and consider how authors discuss the limitations of both prior work and their own work. Overall, we find that relatively few papers engage with possible stakeholders or contexts of use, which limits their consideration of potential downstream adverse impacts or other responsible AI issues. Based on our findings, we make recommendations on concrete practices and research directions.