Gaining insight into the potential negative impacts of emerging Artificial Intelligence (AI) technologies in society is a challenge for implementing anticipatory governance approaches. One approach to produce such insight is to use Large Language Models (LLMs) to support and guide experts in ideating and exploring the range of undesirable consequences of emerging technologies. However, performance evaluations of LLMs for such tasks are still needed, including not only the general quality of generated impacts but also the range of impact types produced and any resulting biases. In this paper, we demonstrate the potential for generating high-quality and diverse impacts of AI in society by fine-tuning completion models (GPT-3 and Mistral-7B) on a diverse sample of articles from news media and comparing those outputs to the impacts generated by instruction-based models (GPT-4 and Mistral-7B-Instruct). We examine the generated impacts for coherence, structure, relevance, and plausibility, and find that the impacts generated using Mistral-7B, a small open-source model fine-tuned on impacts from the news media, tend to be qualitatively on par with impacts generated using a more capable, larger-scale model such as GPT-4. Moreover, we find that impacts produced by instruction-based models exhibited gaps in certain categories of impacts in comparison to fine-tuned models. This research highlights a potential bias in the range of impacts generated by state-of-the-art LLMs and the potential of aligning smaller LLMs on news media as a scalable alternative for generating high-quality and more diverse impacts in support of anticipatory governance approaches.