A morally acceptable course of AI development should avoid two dangers: creating unaligned AI systems that pose a threat to humanity, and mistreating AI systems that merit moral consideration in their own right. This paper argues that these two dangers interact: if we create AI systems that merit moral consideration, simultaneously avoiding both dangers would be extremely challenging. While our argument is straightforward and supported by a wide range of pretheoretical moral judgments, it has far-reaching moral implications for AI development. The most obvious way to avoid the tension between alignment and ethical treatment would be to refrain from creating AI systems that merit moral consideration, but this option may be unrealistic, and even if available, perhaps only temporarily so. We therefore conclude by offering some suggestions for other ways of mitigating the mistreatment risks associated with alignment.