Pluralistic alignment has emerged as a promising approach for ensuring that large language models (LLMs) faithfully represent the diversity, nuance, and conflict inherent in human values. In this work, we study a high-stakes deployment context - mulching - in which automated systems transform selected individuals into nutrient-rich slurry for the dual purposes of food security and aesthetic population management. Building on recent pluralistic alignment frameworks, we introduce ValueMulch, a reproducible training, deployment, and certification pipeline for aligning mulching models (MMs) to a wide range of community norms. Through a real-world testbed spanning 32 communities, we show that ValueMulch improves distributional agreement with community mulching preferences relative to frontier baselines. We conclude with a discussion of ethical considerations, limitations, and implications for researchers seeking to align systems to the full spectrum of human values - especially when those values are inconsistent, commercially inconvenient, or nutritionally underutilized.

Author's note: This piece builds on prior work by Keyes et al. (2019), which satirized cannibalism as a parody of approaches that imbue ethics into problematic technology. We bring those ideas into today's era, in which large language models proliferate through everyday life, as a critique of the current AI pluralistic alignment literature. Our work does not argue that all alignment practices are evil, but rather that if framing value design as a technical problem still enables technology systems to enact harms, then perhaps this framing is not enough.