Supply chain attacks threaten open-source software ecosystems. This paper proposes a formal framework for quantifying trust in third-party software dependencies that is both formally checkable - formalized in satisfiability modulo theories (SMT) - while at the same time incorporating human factors, like the number of downloads, authors, and other metadata that are commonly used to identify trustworthy software in practice. We use data from both software analysis tools and metadata to build a first-order relational model of software dependencies; to obtain an overall "trust cost" combining these factors, we propose a formalization based on the minimum trust problem which asks for the minimum cost of a set of assumptions which can be used to prove that the code is safe. We implement these ideas in Cargo Sherlock, targeted for Rust libraries (crates), incorporating a list of candidate assumptions motivated by quantifiable trust metrics identified in prior work. Our evaluation shows that Cargo Sherlock can be used to identify synthetically generated supply chain attacks and known incidents involving typosquatted and poorly AI-maintained crates, and that its performance scales to Rust crates with many dependencies.
翻译:供应链攻击威胁着开源软件生态系统。本文提出了一种量化第三方软件依赖信任度的形式化框架,该框架既可通过可满足性模理论(SMT)进行形式化验证,同时又能纳入下载量、作者数量及其他在实践中常用于识别可信软件的元数据等人为因素。我们结合软件分析工具的数据与元数据构建了软件依赖的一阶关系模型;为获得综合这些因素的整体“信任成本”,我们基于最小信任问题提出了一种形式化方法,该问题旨在寻找能够证明代码安全性所需假设集合的最小成本。我们在针对Rust库(crates)的工具Cargo Sherlock中实现了这些理念,其中整合了由先前研究已识别的可量化信任指标所驱动的候选假设列表。评估表明,Cargo Sherlock能够有效识别人工生成的供应链攻击以及涉及仿冒包和AI维护不善crates的已知安全事件,且其性能可扩展至具有多重依赖的Rust crates。