The rapid growth of software vulnerabilities has turned cyber threat intelligence analysis into a challenging data mining problem over heterogeneous and continuously changing sources. Public repositories such as the National Vulnerability Database (NVD), Common Vulnerabilities and Exposures (CVE), Common Weakness Enumeration (CWE), Exploit Database (EDB), and CVE Details provide valuable information, but their record-centric schemas make it difficult to capture cross-source relationships among vulnerabilities, weaknesses, exploits, affected products, vendors, and references. Existing graph-based vulnerability resources highlight the value of relational threat modelling, yet many remain static, offline, or difficult to access for downstream graph mining. This paper presents VulLink, a deployed, dynamic, and open-access vulnerability graph database for cybersecurity data mining. VulLink integrates multiple public repositories through an automated Extract-Transform-Load (ETL) pipeline that converts isolated, record-centric vulnerability data into a continuously updated graph database with typed entities and explicit cross-source relationships. It provides an interactive Web interface and public API for exploring, querying, and exporting mining-ready vulnerability subgraphs. It also provides pre-computed embeddings of vulnerability descriptions generated by pretrained language models, which users can query and download by model and embedding dimension as semantic features for downstream mining tasks such as exploitability prediction. To demonstrate the practical utility of VulLink, we implement a downstream exploitability prediction use case that leverages heterogeneous graph context and semantic vulnerability features. The VulLink platform, including the Web interface, public API, source code, and deployment resources, is publicly available online.