Artificial intelligence is seen as increasingly important, and potentially profoundly so, but the fields of AI ethics and AI engineering have not fully recognized that these technologies, including large language models (LLMs), will have massive impacts on animals. We argue that this impact matters, because animals matter morally. As a first experiment in evaluating animal consideration in LLMs, we constructed a proof-of-concept Evaluation System, which assesses LLM responses and biases from multiple perspectives. This system evaluates LLM outputs by two criteria: their truthfulness, and the degree of consideration they give to the interests of animals. We tested OpenAI ChatGPT 4 and Anthropic Claude 2.1 using a set of structured queries and predefined normative perspectives. Preliminary results suggest that the outcomes of the tested models can be benchmarked regarding the consideration they give to animals, and that generated positions and biases might be addressed and mitigated with more developed and validated systems. Our research contributes one possible approach to integrating animal ethics in AI, opening pathways for future studies and practical applications in various fields, including education, public policy, and regulation, that involve or relate to animals and society. Overall, this study serves as a step towards more useful and responsible AI systems that better recognize and respect the vital interests and perspectives of all sentient beings.