Aligning AI systems with human values fundamentally relies on effective human feedback. While significant research has addressed training algorithms, the role of the user interface is often overlooked, treated as an implementation detail rather than a critical factor in alignment. This paper addresses this gap by introducing a reference model that offers a systematic framework for analyzing where and how user interface contributions can improve human-AI alignment. The reference model's structured taxonomy is demonstrated through two case studies and a preliminary investigation of six user interfaces. This work highlights opportunities to advance alignment through human-computer interaction.