The United Nations' Sustainable Development Goals (SDGs) provide a globally recognised framework for addressing critical societal, environmental, and economic challenges. Recent developments in natural language processing (NLP) and large language models (LLMs) have facilitated the automatic classification of textual data according to its relevance to specific SDGs. In many applications, however, it is equally important to determine the directionality of this relevance; that is, to assess whether the described impact is positive, neutral, or negative. To tackle this challenge, we propose the novel task of SDG polarity detection, which assesses whether a text segment indicates progress toward a specific SDG or conveys an intention to achieve such progress. To support research in this area, we introduce SDG-POD, a benchmark dataset designed specifically for this task that combines original and synthetically generated data. We perform a comprehensive evaluation of six state-of-the-art LLMs in both zero-shot and fine-tuned configurations. Our results suggest that the task remains challenging for the current generation of LLMs. Nevertheless, some fine-tuned models, particularly QWQ-32B, achieve good performance, especially on specific SDGs such as SDG-9 (Industry, Innovation and Infrastructure), SDG-12 (Responsible Consumption and Production), and SDG-15 (Life on Land). Furthermore, we demonstrate that augmenting the fine-tuning dataset with synthetically generated examples improves model performance on this task, which highlights the effectiveness of data enrichment techniques in this resource-constrained domain. This work advances the methodological toolkit for sustainability monitoring and provides actionable insights into the development of efficient, high-performing polarity detection systems.