This paper presents a trace-based simulation methodology for constructing representations of workload-allocator interaction. We use two-dimensional rectangular bin packing (2DBP) as our foundation. Classical 2DBP algorithms minimize the makespan of the packings they produce, but that criterion is inappropriate for virtual memory systems that employ demand paging. We instead view an allocator's placement decisions as a solution to a 2DBP instance that optimizes some unknown criterion particular to that allocator's policy. Our end product is a compact data structure: for example, the simulation of 80 million requests fits in a 350 MiB file. By design, it is concerned only with events residing entirely in virtual memory; no information on memory accesses, indexing costs or any other factor is kept. We make an initial case for our contribution's significance by exploring its relationship to maximum resident set size (RSS). Our baseline is the assumption that less fragmentation amounts to smaller peak RSS. We thus define a fragmentation metric in the 2DBP substrate and compute it for 28 workloads linked to 4 modern allocators. We also measure peak RSS for the 112 resulting pairs. Our metric exhibits a strong monotonic relationship with peak RSS (Spearman coefficient $\rho>0.65$) in half of those cases: allocators achieving better 2DBP placements yield $9\%$ to $30\%$ smaller peak RSS, and the trends remain consistent across two different machines. Considering how minimal our representation is, this empirical evidence is a robust indicator of its potency. If workload-allocator interplay in the virtual address space suffices to evaluate a novel fragmentation definition, numerous other useful applications of our tool can be studied. Both augmenting 2DBP and exploring alternative computations on it offer fertile ground for future research.
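To make the 2DBP view concrete, the following is a minimal sketch under stated assumptions, not the data structure or metric defined in the paper. It assumes each request maps to a rectangle whose horizontal extent is its lifetime and whose vertical extent is its address range. The `Block` type, the `stand_in_fragmentation` ratio (address extent over peak live bytes) and all helper names are hypothetical, chosen only for illustration. The Spearman coefficient is computed in the standard way, as the Pearson correlation of average ranks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """One request as a rectangle (hypothetical layout, not the paper's format)."""
    start: int  # logical time of the allocation
    end: int    # logical time of the matching free (exclusive)
    addr: int   # virtual address chosen by the allocator
    size: int   # bytes requested


def peak_live_bytes(blocks):
    """Largest total size of simultaneously live blocks."""
    events = []
    for b in blocks:
        events.append((b.start, b.size))
        events.append((b.end, -b.size))
    # Sorting tuples places frees (negative deltas) before allocations
    # that share the same timestamp.
    events.sort()
    live = peak = 0
    for _, delta in events:
        live += delta
        peak = max(peak, live)
    return peak


def address_extent(blocks):
    """Height of the packing: span between lowest and highest used byte."""
    return max(b.addr + b.size for b in blocks) - min(b.addr for b in blocks)


def stand_in_fragmentation(blocks):
    """Assumed stand-in metric: 1.0 is a perfectly tight placement."""
    return address_extent(blocks) / peak_live_bytes(blocks)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman coefficient: Pearson correlation of the rank vectors."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

In such a setup, one would compute the stand-in metric once per workload-allocator pair and pass the resulting values, together with the measured peak RSS figures, to `spearman`.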