The deepening integration of artificial intelligence (AI) into human society carries significant implications for societal governance and safety. While considerable strides have been made in addressing AI alignment challenges, existing methodologies focus primarily on technical facets and often neglect the intricate sociotechnical nature of AI systems, which can lead to misalignment between development and deployment contexts. To this end, we pose a new problem worth exploring: the Incentive Compatibility Sociotechnical Alignment Problem (ICSAP). We hope this call will prompt more researchers to explore how the principles of Incentive Compatibility (IC) from game theory can bridge the gap between technical and societal components, maintaining consensus between AI and human societies across different contexts. We further discuss three classical game-theoretic problems for achieving IC: mechanism design, contract theory, and Bayesian persuasion. For each, we examine its perspective on, potential for, and challenges in solving ICSAP, and we provide preliminary conceptions of implementation.