Generative AI models reproduce the biases in the training data and can further amplify them through mode collapse. We refer to the resulting harmful loss of diversity as homogenization. Our position is that homogenization should be a primary concern in AI safety. We introduce xeno-reproduction as the strategy that mitigates homogenization. For auto-regressive LLMs, we formalize xeno-reproduction as a structure-aware diversity pursuit. Our contribution is foundational, intended to open an essential line of research and invite collaboration to advance diversity.
翻译:生成式人工智能模型会复现训练数据中的偏差,并可能通过模式坍塌进一步放大这些偏差。我们将这种因多样性丧失而产生的有害结果称为同质化。我们的观点是,同质化应成为人工智能安全领域的核心关切。我们提出异质再生产策略,旨在缓解同质化问题。针对自回归大语言模型,我们将异质再生产形式化为一种结构感知的多样性追求。我们的贡献具有基础性,旨在开辟一条必要的研究方向,并邀请业界合作共同推进多样性发展。