Context: Modern open-source operating systems consist of numerous independent packages written by developers worldwide. To manage this diverse body of software from many different sources, Linux distributions have developed package management tools. Although tools such as Ubuntu's apt make software installation convenient, they can obscure how fresh a distribution's packages are relative to their upstream projects. Objective: This research aims to develop a method for systematically identifying packages in a Linux distribution that show low development activity between the versions of the OSS projects included in successive releases. Upstream projects use a heterogeneous mix of versioning strategies, and their version strings are passed through to the package manager, often with distribution-specific version information appended, which makes this work both interesting and non-trivial. Method: We use regular expressions to extract the epoch and the upstream project's major, minor, and patch versions for more than 6,000 packages in the Ubuntu distribution, and we document how we assign these values for projects that do not follow the semantic versioning standard. Using the libyears metric from the CHAOSS project, we calculate the freshness of a subset of the packages in a distribution against the latest upstream project release. This work led directly to the development of the Package Version Activity Classifier (PVAC), a novel method for systematically assessing the staleness of packages across multiple distribution releases.
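The extraction step described under Method can be illustrated with a minimal sketch. It is not the authors' implementation: the regular expressions, the names `ParsedVersion`, `parse_version`, and `libyears`, and the 365.25-day year are all assumptions made for illustration. The sketch assumes the Debian-style layout `[epoch:]upstream[-revision]` used by Ubuntu packages, treats the leading dotted integers of the upstream part as major, minor, and patch, and computes libyears as the time between the packaged upstream release and the latest upstream release.

```python
import re
from datetime import date
from typing import NamedTuple, Optional


class ParsedVersion(NamedTuple):
    """Hypothetical container for the fields named in the abstract."""
    epoch: int
    major: Optional[int]
    minor: Optional[int]
    patch: Optional[int]


# Assumed layout: [epoch:]upstream[-revision]; the revision follows the last hyphen.
_DEB_VERSION = re.compile(
    r"^(?:(?P<epoch>\d+):)?(?P<upstream>.+?)(?:-(?P<revision>[^-]+))?$"
)
# Leading dotted integers of the upstream version; anything after them is ignored.
_UPSTREAM = re.compile(
    r"^(?P<major>\d+)(?:\.(?P<minor>\d+))?(?:\.(?P<patch>\d+))?"
)


def parse_version(version: str) -> ParsedVersion:
    """Split a package version string into epoch, major, minor, and patch."""
    m = _DEB_VERSION.match(version.strip())
    if m is None:
        raise ValueError(f"unparseable version: {version!r}")
    epoch = int(m.group("epoch") or 0)  # a missing epoch is treated as 0
    u = _UPSTREAM.match(m.group("upstream"))
    if u is None:
        # Upstream version does not start with a number (not semver-like).
        return ParsedVersion(epoch, None, None, None)
    major, minor, patch = (
        int(g) if g is not None else None
        for g in u.group("major", "minor", "patch")
    )
    return ParsedVersion(epoch, major, minor, patch)


def libyears(packaged_release: date, latest_release: date) -> float:
    """Age of the packaged release relative to the latest one, in years.

    Assumption: one year is 365.25 days, and a package that is newer than
    the recorded latest release counts as 0 libyears.
    """
    days = max((latest_release - packaged_release).days, 0)
    return days / 365.25


if __name__ == "__main__":
    print(parse_version("1:2.34.1-1ubuntu2"))
    print(parse_version("20230311-1"))  # date-style version: only a major field
    print(round(libyears(date(2020, 1, 1), date(2022, 1, 1)), 2))
```

A date-style version such as `20230311-1` yields only a major field here. Projects like this, which do not follow semantic versioning, are the cases where an explicit, documented assignment rule is needed.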