KernJC: Automated Vulnerable Environment Generation for Linux Kernel Vulnerabilities

Linux kernel vulnerability reproduction is a critical task in system security. Reproducing a kernel vulnerability requires both a vulnerable environment and a Proof of Concept (PoC) program. Most existing research focuses on PoC generation, while environment construction is overlooked. However, establishing a vulnerable environment in which a vulnerability can actually be triggered is challenging. First, it is hard to guarantee that the kernel version selected for reproduction is vulnerable, because the affected-version claims in online databases are occasionally spurious. Second, many vulnerabilities cannot be reproduced in kernels built with default configurations. Intricate non-default kernel configurations must be set to include and trigger a kernel vulnerability, but little information is available on how to identify these configurations. To address these challenges, we propose a patch-based approach to identify kernel versions that are truly vulnerable and a graph-based approach to identify the configs necessary for activating a specific vulnerability. We implement these approaches in a tool, KernJC, which automates the generation of vulnerable environments for kernel vulnerabilities. To evaluate the efficacy of KernJC, we build a dataset of 66 representative real-world vulnerabilities with PoCs, drawn from kernel vulnerability research of the past five years. The evaluation shows that KernJC builds vulnerable environments for all of these vulnerabilities; 48.5% of them require non-default configs, and 4 have incorrect version claims in the National Vulnerability Database (NVD). Furthermore, we conduct large-scale spurious version detection on kernel vulnerabilities and identify 128 vulnerabilities that have spurious version claims in NVD. To foster future research, we release KernJC and the dataset to the community.
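
The two ideas named in the abstract can be illustrated with a minimal sketch. It is not KernJC's implementation: the function names, the line-matching heuristic, and the dictionary-based dependency graph are all assumptions made for illustration. The first function treats a source file as unpatched when the lines a fix removes are still present and none of the lines it adds appear. The second collects a config option together with everything it transitively depends on.

```python
from collections import deque


def looks_unpatched(source_text, removed_lines, added_lines):
    """Hypothetical heuristic: the file is treated as still vulnerable when
    every line the fix removes is present and no line the fix adds is."""
    present = {line.strip() for line in source_text.splitlines()}
    has_old = all(line.strip() in present for line in removed_lines)
    has_new = any(line.strip() in present for line in added_lines)
    return has_old and not has_new


def required_configs(target, depends_on):
    """Breadth-first walk over an assumed dependency graph (option -> list of
    options it depends on). Returns the target plus all transitive deps."""
    seen = {target}
    queue = deque([target])
    while queue:
        option = queue.popleft()
        for dep in depends_on.get(option, ()):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen


if __name__ == "__main__":
    # Made-up option names and patch lines, for illustration only.
    graph = {
        "CONFIG_EXAMPLE_DRIVER": ["CONFIG_EXAMPLE_BUS"],
        "CONFIG_EXAMPLE_BUS": ["CONFIG_EXAMPLE_CORE"],
    }
    print(sorted(required_configs("CONFIG_EXAMPLE_DRIVER", graph)))
    print(looks_unpatched("if (len > max)\n", ["if (len > max)"], ["if (len >= max)"]))
```

A real Kconfig dependency expression can combine options with negation and disjunction, and a real patch check has to cope with backports and reformatted code, so both functions should be read only as a picture of the problem shape.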
