Recent advances in artificial intelligence (AI) have produced systems capable of increasingly sophisticated performance on cognitive tasks. However, AI systems still struggle in critical ways: they are brittle in unpredictable and novel environments (robustness), opaque in their reasoning (explainability), limited in communication and commitment (cooperation), and prone to potentially harmful actions (safety). We argue that these shortcomings stem from one overarching failure: AI systems lack wisdom. Drawing from the cognitive and social sciences, we define wisdom as the ability to navigate intractable problems - those that are ambiguous, radically uncertain, novel, chaotic, or computationally explosive - through effective task-level and metacognitive strategies. While AI research has focused on task-level strategies, metacognition - the ability to reflect on and regulate one's own thought processes - remains underdeveloped in AI systems. In humans, metacognitive strategies such as recognizing the limits of one's knowledge, considering diverse perspectives, and adapting to context are essential for wise decision-making. We propose that integrating metacognitive capabilities into AI systems is crucial for enhancing their robustness, explainability, cooperation, and safety. By focusing on developing wise AI, we suggest an alternative to aligning AI with specific human values - a task fraught with conceptual and practical difficulties. Instead, wise AI systems can thoughtfully navigate complex situations, account for diverse human values, and avoid harmful actions. We discuss potential approaches to building wise AI, including benchmarking metacognitive abilities and training AI systems to employ wise reasoning. Prioritizing metacognition in AI research will lead to systems that act not only intelligently but also wisely in complex, real-world situations.