With the rapid advancement of artificial intelligence, machine learning models are becoming part of daily life. High-quality models rely not only on efficient optimization algorithms but also on training with vast amounts of data and computational power. In practice, however, constraints such as limited computational resources and data privacy concerns often prevent users who need models from training them locally, leading them to explore alternatives such as outsourced learning and federated learning. While these methods make model training feasible, they raise concerns about the trustworthiness of the training process, because the computations are not performed locally. Outsourced model inference raises similar trustworthiness issues. Both problems can be summarized as the trustworthiness problem of model computations: how can one verify that the results computed by other participants were derived according to the specified algorithm, model, and input data? Verifiable machine learning (VML) has emerged to address this challenge. This paper presents a comprehensive survey of zero-knowledge proof-based verifiable machine learning (ZKP-VML). We first analyze the verifiability issues that may arise in different machine learning scenarios. We then provide a formal definition of ZKP-VML, followed by a detailed analysis and classification of existing works based on their technical approaches. Finally, we discuss the key challenges and future directions for ZKP-VML.