What is a "bug"? On subjectivity, epistemic power, and implications for software research

Considerable effort in software research and practice is spent on bugs. Finding, reporting, tracking, and triaging them, attempting to fix them automatically, and detecting "bug smells" together account for a substantial portion of large projects' time and development cost, and are of significant interest to researchers in Software Engineering, Programming Languages, and beyond. But what is a bug, exactly? While segmentation faults rarely spark joy, most bugs are not so clear cut. Per the Oxford English Dictionary, the word "bug" has been a colloquialism for an engineering "defect" at least since the 1870s. Most modern software-oriented definitions speak to a disconnect between what a developer intended and what a program actually does. Formal verification, from its inception, has developed means to identify deviations from a formal specification, which is expected to encode desired behavior more or less fully. However, software is rarely accompanied by a full and formal specification, so developer intent is instead treated as implicit or, at best, partially documented. The International Software Testing Qualifications Board writes: "A human being can make an error (mistake), which produces a defect (fault, bug) in the program code, or in a document. If a defect in code is executed, the system may fail to do what it should do (or do something it shouldn't), causing a failure. Defects may result in failures, but not all [do]". Most sources forsake this precision. The influential paper "Finding bugs is easy" begins by saying "bug patterns are code idioms that are often errors", with no particular elaboration. Other work relies on imperfect practical proxies for specifications. For example, in automatic program repair research, a bug corresponds to a failing test case: when the test passes, the bug is considered fixed. However, when we interrogate even these fairly straightforward definitions, they start to break down...
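To make the test-based proxy concrete, consider a minimal sketch in Python. Every name in it (buggy_median, overfit_patch, reference_median, passes_test) is hypothetical and invented for illustration; none comes from a real repair tool or benchmark. It assumes the developer's intent is "return the median of three numbers" and that a single test case stands in for that intent.

```python
def buggy_median(a, b, c):
    """Hypothetical function intended to return the median of three numbers."""
    # Defect: returns the maximum instead of the median.
    return max(a, b, c)


def overfit_patch(a, b, c):
    """Hypothetical 'repair' that only special-cases the tested input."""
    if (a, b, c) == (1, 2, 3):
        return 2
    return max(a, b, c)  # the original defect is still here


def reference_median(a, b, c):
    """What the developer presumably intended (an assumption of this sketch)."""
    return sorted((a, b, c))[1]


def passes_test(fn):
    """The single test case acting as a proxy for the specification."""
    return fn(1, 2, 3) == 2


if __name__ == "__main__":
    for fn in (buggy_median, overfit_patch, reference_median):
        print(f"{fn.__name__}: passes test = {passes_test(fn)}, "
              f"f(3, 1, 2) = {fn(3, 1, 2)}, f(1, 2, 2) = {fn(1, 2, 2)}")
```

Under the "failing test" definition, overfit_patch has fixed the bug, because the test now passes, yet it still returns the maximum for every other input. In the other direction, buggy_median agrees with reference_median on the input (1, 2, 2): the defect executes but produces no failure, which is the distinction the quoted testing-board definition draws.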
