Ensuring the integrity of software build artifacts is a growing concern in modern software engineering, driven by increasingly sophisticated attacks on build systems, distribution channels, and development infrastructure. Reproducible builds, in which binaries built independently from the same source code can be verified to be bit-for-bit identical to the distributed artifacts, provide a principled foundation for transparency and trust in software distribution. Despite this potential, large-scale adoption of reproducible builds faces two significant challenges: achieving high reproducibility rates across vast software collections, and establishing reproducibility monitoring infrastructure that can operate at very large scale. Recent studies have shown that high reproducibility rates are achievable at scale (the Nix ecosystem achieves over 90% reproducibility on more than 80,000 packages), but the problem of effective reproducibility monitoring remains largely unsolved. In this work, we address the reproducibility monitoring challenge by introducing Lila, a decentralized system for reproducibility assessment tailored to the functional package management model. Lila enables distributed reporting of build results and their aggregation into a reproducibility database, benefiting both practitioners and future empirical studies of build reproducibility.
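To make the idea of distributed reporting and aggregation concrete, the following is a minimal sketch of how independently reported build results could be combined into a reproducibility verdict per package. It is not Lila's data model or API: the `BuildReport` schema, the `aggregate` and `reproducibility_rate` functions, and the rule that a recipe counts as checked only once two distinct builders have reported on it are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildReport:
    """One builder's claim about one build recipe (hypothetical schema)."""
    recipe: str       # identifier of the build recipe, e.g. a derivation name
    builder: str      # identity of whoever performed the build
    output_hash: str  # SHA-256 hex digest of the produced artifact


def hash_artifact(data: bytes) -> str:
    """Return the SHA-256 hex digest of an artifact's bytes."""
    return hashlib.sha256(data).hexdigest()


def aggregate(reports):
    """Classify each recipe from the reports submitted for it.

    Assumed rule: a recipe is 'reproducible' if at least two distinct
    builders reported it and all of their hashes agree, 'unreproducible'
    if any two hashes differ, and 'unchecked' otherwise.
    """
    hashes = defaultdict(dict)
    for report in reports:
        # Keep the latest report per (recipe, builder) pair.
        hashes[report.recipe][report.builder] = report.output_hash

    status = {}
    for recipe, by_builder in hashes.items():
        if len(by_builder) < 2:
            status[recipe] = "unchecked"
        elif len(set(by_builder.values())) == 1:
            status[recipe] = "reproducible"
        else:
            status[recipe] = "unreproducible"
    return status


def reproducibility_rate(status):
    """Fraction of checked recipes that are reproducible, or None if none were checked."""
    checked = [s for s in status.values() if s != "unchecked"]
    if not checked:
        return None
    return sum(s == "reproducible" for s in checked) / len(checked)
```

Comparing hashes rather than full artifacts keeps each report small, which matters when many independent builders report on tens of thousands of packages. A real system would also have to authenticate reports and decide how much to trust each builder, which this sketch leaves out.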