ConFL: Constraint-guided Fuzzing for Machine Learning Framework

As machine learning gains prominence in various sectors of society for automated decision-making, concerns have risen regarding potential vulnerabilities in machine learning (ML) frameworks. Nevertheless, testing these frameworks is a daunting task due to their intricate implementation. Previous research on fuzzing ML frameworks has struggled to effectively extract input constraints and generate valid inputs, leading to extended fuzzing durations for deep execution or revealing the target crash. In this paper, we propose ConFL, a constraint-guided fuzzer for ML frameworks. ConFL automatically extracting constraints from kernel codes without the need for any prior knowledge. Guided by the constraints, ConFL is able to generate valid inputs that can pass the verification and explore deeper paths of kernel codes. In addition, we design a grouping technique to boost the fuzzing efficiency. To demonstrate the effectiveness of ConFL, we evaluated its performance mainly on Tensorflow. We find that ConFL is able to cover more code lines, and generate more valid inputs than state-of-the-art (SOTA) fuzzers. More importantly, ConFL found 84 previously unknown vulnerabilities in different versions of Tensorflow, all of which were assigned with new CVE ids, of which 3 were critical-severity and 13 were high-severity. We also extended ConFL to test PyTorch and Paddle, 7 vulnerabilities are found to date.

翻译：随着机器学习在社会各领域自动化决策中日益凸显其重要性，机器学习框架的潜在漏洞问题引发广泛关注。然而，由于框架实现的复杂性，对其进行测试极具挑战性。现有针对机器学习框架的模糊测试研究难以有效提取输入约束并生成合法输入，导致深度执行或目标崩溃的模糊测试耗时较长。本文提出ConFL——一种面向机器学习框架的约束引导式模糊测试工具。ConFL无需任何先验知识即可从内核代码中自动提取约束。在约束引导下，ConFL能够生成可通过验证并探索内核代码更深路径的合法输入。此外，我们设计了一种分组技术来提升模糊测试效率。为验证ConFL的有效性，我们主要在TensorFlow上评估其性能。实验表明，与现有最优模糊测试工具相比，ConFL能够覆盖更多代码行并生成更多合法输入。更重要的是，ConFL在TensorFlow不同版本中发现了84个先前未知的漏洞，所有漏洞均被分配新的CVE编号，其中3个为严重级别，13个为高级别。我们还将ConFL扩展至PyTorch和PaddlePaddle框架，迄今已发现7个漏洞。

相关内容

Machine Learning

关注 2251

机器学习（Machine Learning）是一个研究计算学习方法的国际论坛。该杂志发表文章，报告广泛的学习方法应用于各种学习问题的实质性结果。该杂志的特色论文描述研究的问题和方法，应用研究和研究方法的问题。有关学习问题或方法的论文通过实证研究、理论分析或与心理现象的比较提供了坚实的支持。应用论文展示了如何应用学习方法来解决重要的应用问题。研究方法论文改进了机器学习的研究方法。所有的论文都以其他研究人员可以验证或复制的方式描述了支持证据。论文还详细说明了学习的组成部分，并讨论了关于知识表示和性能任务的假设。官网地址：http://dblp.uni-trier.de/db/journals/ml/

UCM《机器学习导论笔记》，80页pdf CSE176 Introduction to Machine Learning

专知会员服务

32+阅读 · 2021年9月29日

【CHI2020-微软】解释可解释性:理解数据科学家使用机器学习的可解释性工具，Interpreting Interpretability: Understanding Data Scientists’Use of Interpretability Tools for Machine Learning

专知会员服务

55+阅读 · 2020年3月8日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

34+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日