With the recent advances in natural language processing (NLP), a vast number of applications have emerged across various use cases. Among these applications, many academic researchers are motivated to do work that has a positive social impact, in line with the recent initiatives of NLP for Social Good (NLP4SG). However, it is not always obvious to researchers how their research efforts address today's major social problems. Thus, in this paper, we introduce NLP4SGPAPERS, a scientific dataset with three associated tasks that can help identify NLP4SG papers and characterize the NLP4SG landscape by: (1) identifying the papers that address a social problem, (2) mapping them to the corresponding UN Sustainable Development Goals (SDGs), and (3) identifying the task they are solving and the methods they are using. Using state-of-the-art NLP models, we address each of these tasks and apply the models to the entire ACL Anthology, resulting in a visualization workspace that gives researchers a comprehensive overview of the field of NLP4SG. Our website is available at https://nlp4sg.vercel.app . We release our data at https://huggingface.co/datasets/feradauto/NLP4SGPapers and our code at https://github.com/feradauto/nlp4sg .