Many system management runtimes (SMRs), such as resource management and power management techniques, rely on quality-of-service (QoS) metrics, such as tail latency or throughput, as feedback. These QoS metrics are generally observable neither with hardware performance counters nor directly within the OS kernel. As a result, the application must be instrumented and its QoS feedback integrated separately with each management runtime, which adds complexity and overhead. To bridge this gap, we introduce eBeeMetrics, an eBPF-based library framework that accurately observes application-level metrics derived only from eBPF-observable events, such as system calls. eBeeMetrics can serve as a drop-in replacement that decouples system management runtimes from QoS metric feedback reporting, or it can supplement existing QoS metrics to better identify server-side dynamics. Its metrics correlate strongly with measured throughput and latency across various latency-sensitive workloads. The eBeeMetrics tool is open-source; the source code is available at: https://github.com/Ibnathism/eBeeMetrics.
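To make the idea of deriving application-level metrics from system-call events concrete, the following minimal sketch estimates throughput and tail latency from a stream of timestamped `recv`/`send` events. It is an illustration under stated assumptions, not the eBeeMetrics implementation: the `SyscallEvent` record, the `estimate_qos` function, and the pairing rule (the first `recv` on a connection opens a request, the next `send` on that connection closes it) are all hypothetical, and the events are assumed to have already been collected by some eBPF probe.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SyscallEvent:
    """Hypothetical record for one traced system call (not an eBeeMetrics type)."""
    ts_ns: int   # timestamp of the call, in nanoseconds
    fd: int      # file descriptor identifying the connection
    name: str    # "recv" or "send"


def estimate_qos(events, percentile=99.0):
    """Estimate request count, throughput, and tail latency from syscall events.

    Assumption: the first recv on a connection marks the start of a request and
    the next send on the same connection marks its end.
    """
    pending = {}     # fd -> timestamp of the earliest unanswered recv
    latencies = []   # per-request latency estimates, in nanoseconds

    for ev in sorted(events, key=lambda e: e.ts_ns):
        if ev.name == "recv":
            # Keep the earliest recv if a request arrives in several chunks.
            pending.setdefault(ev.fd, ev.ts_ns)
        elif ev.name == "send" and ev.fd in pending:
            latencies.append(ev.ts_ns - pending.pop(ev.fd))

    if not latencies:
        return {"requests": 0, "throughput_rps": 0.0, "tail_latency_ns": None}

    # Throughput: completed requests divided by the observed time span.
    span_ns = max(e.ts_ns for e in events) - min(e.ts_ns for e in events)
    throughput = len(latencies) / (span_ns / 1e9) if span_ns > 0 else 0.0

    # Tail latency via the nearest-rank percentile method.
    latencies.sort()
    rank = max(1, math.ceil(percentile / 100.0 * len(latencies)))
    return {
        "requests": len(latencies),
        "throughput_rps": throughput,
        "tail_latency_ns": latencies[rank - 1],
    }


# Usage with synthetic events (timestamps in nanoseconds).
sample_events = [
    SyscallEvent(0, 3, "recv"),
    SyscallEvent(1_000_000, 4, "recv"),
    SyscallEvent(2_000_000, 3, "send"),
    SyscallEvent(5_000_000, 4, "send"),
    SyscallEvent(6_000_000, 3, "recv"),
    SyscallEvent(10_000_000, 3, "send"),
]
result = estimate_qos(sample_events)
print(result)
```

A real workload would need a more careful pairing rule (for example, handling pipelined requests or responses split across several `send` calls); the single-request-per-connection assumption here is chosen only to keep the sketch short.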