This article raises an important and challenging workload characterization question: can we determine what percentage each critical component across the software and hardware stack contributes to a specific bottleneck? Typical critical components include languages, programming frameworks, runtime environments, instruction set architectures (ISA), operating systems (OS), and microarchitecture. Answering this question would support a systematic methodology for guiding software and hardware co-design and the optimization of critical components. We propose a whole-picture workload characterization (WPC) methodology to address it. In essence, WPC is an iterative ORFE loop consisting of four steps: Observation, Reference, Fusion, and Exploration. WPC observes data at different levels (observation), fuses and normalizes the performance data (fusion) with respect to a well-designed standard reference workload suite (reference), and explores the software and hardware co-design space (exploration) to investigate the impact of critical components across the stack. We build and open-source the WPC tool. Our evaluations confirm that WPC can quantitatively reveal the contributions of the language, framework, runtime environment, ISA, OS, and microarchitecture to the primary pipeline efficiency.
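The abstract does not say how fusion or attribution is computed, so the following is only a minimal sketch of what the fusion and exploration steps might look like, not the WPC tool's actual method. It assumes that fusion means dividing each observed metric by the value measured for the reference suite, and that exploration changes one component at a time and attributes to it a share of the total shift in the normalized metric. The function names `normalize` and `contribution_shares`, the metric name `retiring`, and all numbers are hypothetical.

```python
def normalize(observed, reference):
    """Fusion (assumed form): express each observed metric as a ratio to the
    value measured for the reference workload suite."""
    return {
        metric: observed[metric] / reference[metric]
        for metric in observed
        if metric in reference and reference[metric] != 0
    }


def contribution_shares(baseline, variants):
    """Exploration (assumed form): each variant changes exactly one component.
    Return each component's share, in percent, of the total absolute shift
    in the normalized metric relative to the baseline."""
    deltas = {name: abs(value - baseline) for name, value in variants.items()}
    total = sum(deltas.values())
    if total == 0:
        # No variant moved the metric, so nothing can be attributed.
        return {name: 0.0 for name in deltas}
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


# Hypothetical numbers, for illustration only.
reference = {"retiring": 0.6}
baseline = normalize({"retiring": 0.3}, reference)["retiring"]
variants = {
    "language": normalize({"retiring": 0.45}, reference)["retiring"],
    "runtime": normalize({"retiring": 0.375}, reference)["retiring"],
    "os": normalize({"retiring": 0.3}, reference)["retiring"],
}
shares = contribution_shares(baseline, variants)
for component, share in shares.items():
    print(f"{component}: {share:.1f}%")
```

A one-at-a-time attribution like this ignores interactions between components (for example, a runtime whose effect depends on the ISA), which is one reason an iterative loop over the design space may be needed in practice.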