\textbf{Context:} Policy-as-Code (PaC) has become a foundational approach for embedding governance, compliance, and security requirements directly into software systems. Although organizations increasingly adopt PaC tools, the software engineering community lacks an empirical understanding of how these tools are used in real-world development practice. \textbf{Objective:} This paper aims to bridge this gap by conducting the first large-scale study of PaC usage in open-source software. Our goal is to characterize how PaC tools are adopted, what purposes they serve, and what governance activities they support across diverse software ecosystems. \textbf{Method:} We analyzed 399 GitHub repositories that use nine widely adopted PaC tools. Our mixed-methods approach combines quantitative analysis of tool usage and project characteristics with a qualitative investigation of policy files. We further employ a Large Language Model (LLM)--assisted classification pipeline, refined through expert validation, to derive a taxonomy of PaC usage consisting of 5 categories and 15 sub-categories. \textbf{Results:} Our study reveals substantial diversity in PaC adoption. PaC tools are frequently used in early-stage projects and are heavily oriented toward governance, configuration control, and documentation. We also observe emerging PaC usage in MLOps pipelines and strong co-usage patterns, such as between OPA and Gatekeeper. Our taxonomy highlights recurring governance intents. \textbf{Conclusion:} Our findings offer actionable insights for practitioners and tool developers. They document concrete patterns of actual PaC usage and motivate opportunities for improving tool interoperability. This study lays the empirical foundation for future research on PaC practices and their role in ensuring trustworthy, compliant software systems.