Machine learning researchers and practitioners steadily expand the range of successful learning models, guided by in-depth theoretical analyses and empirical heuristics. However, there is no known general-purpose procedure for rigorously evaluating whether a newly proposed model indeed learns successfully from data. We show that such a procedure cannot exist. For PAC binary classification, uniform and universal online learning, and exact learning through teacher-learner interactions, learnability is in general undecidable, both in the sense of independence from the axioms of a formal system and in the sense of uncomputability. Our proofs proceed via computable constructions that encode the consistency problem for formal systems and the halting problem for Turing machines into the question of whether certain function classes are trivial/finite or highly complex. We then relate this question to learnability via established characterizations of learnability through complexity measures. Our work shows that undecidability appears in the theoretical foundations of artificial intelligence: there is no one-size-fits-all algorithm for deciding whether a machine learning model can be successful. We cannot in general automate the process of assessing new learning models.
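To make the flavor of such an encoding concrete, the following is a minimal toy sketch, not the construction used in our proofs. It assumes a hypothetical machine model in which a machine is just a predicate `still_running(t)`, and a hypothetical class rule: the indicator function of a finite set S of natural numbers belongs to the class H_M exactly when M is still running after max(S) steps. Under this rule, membership in H_M is computable by bounded simulation. If M halts after T steps, H_M consists of the indicators of subsets of {0, ..., T-1} and has VC dimension T, so it is PAC learnable. If M never halts, H_M shatters arbitrarily large sets and is not PAC learnable. All function names below are illustrative.

```python
from itertools import combinations
from typing import Callable, FrozenSet

# Hypothetical machine model (an assumption of this sketch): a machine is a
# predicate telling us whether it is still running after t steps. A genuine
# reduction would simulate a Turing machine for t steps instead.
Machine = Callable[[int], bool]


def halts_after(T: int) -> Machine:
    """Toy machine that runs for exactly T steps and then halts."""
    return lambda t: t < T


def never_halts() -> Machine:
    """Toy machine that runs forever."""
    return lambda t: True


def in_class(machine: Machine, support: FrozenSet[int]) -> bool:
    """Decide whether the indicator function of `support` lies in H_M.

    Illustrative rule: the indicator of a finite set S is in H_M iff the
    machine is still running after max(S) steps. The empty set is always in.
    Only a bounded simulation is needed, so membership is computable.
    """
    if not support:
        return True
    return machine(max(support))


def shatters_prefix(machine: Machine, d: int) -> bool:
    """Check whether H_M realizes every labeling of the points 0..d-1.

    Because "still running" is monotone in t, it suffices to test the
    indicator of each subset of the prefix itself.
    """
    points = range(d)
    return all(
        in_class(machine, frozenset(subset))
        for r in range(d + 1)
        for subset in combinations(points, r)
    )


def vc_dimension_up_to(machine: Machine, limit: int) -> int:
    """Largest d <= limit such that {0..d-1} is shattered by H_M.

    This only probes up to `limit`. No finite probe can distinguish
    "halts later than limit" from "never halts", which is where the
    halting problem enters.
    """
    d = 0
    while d < limit and shatters_prefix(machine, d + 1):
        d += 1
    return d
```

For instance, `vc_dimension_up_to(halts_after(3), 6)` evaluates to 3, whereas `vc_dimension_up_to(never_halts(), 6)` reaches the probe limit of 6. In this toy setting, a general procedure deciding whether H_M has finite VC dimension would decide whether M halts.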