In this paper, we present a novel black-box online controller that maximizes goodput, defined as the throughput of requests that satisfy the service-level objective (SLO). The controller requires no internal instrumentation: it relies only on end-to-end measurements taken over short segments, which guide a hill-climbing search. We provide empirical evidence that this design is well-founded. Using this advance in LLM serving as a concrete example, we then discuss the importance of integrating system performance and sustainability metrics into Factsheets for organizations adopting AI systems.
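To make the control loop concrete, the following is a minimal sketch of hill climbing on goodput, not the controller evaluated in the paper. It assumes a single integer control knob (for illustration, a concurrency limit; the abstract does not name the knob) and a hypothetical `measure` callback that runs one short segment at a given setting and returns the goodput observed end to end. The names `goodput`, `hill_climb`, `measure`, `step`, and `max_segments` are all illustrative, and the sketch ignores measurement noise, which a real online controller would have to handle.

```python
from typing import Callable, Sequence


def goodput(latencies_s: Sequence[float], slo_s: float, segment_s: float) -> float:
    """Requests per second that met the SLO during one measurement segment."""
    met_slo = sum(1 for t in latencies_s if t <= slo_s)
    return met_slo / segment_s


def hill_climb(
    measure: Callable[[int], float],  # hypothetical: runs one segment, returns goodput
    start: int,
    lo: int,
    hi: int,
    step: int = 1,
    max_segments: int = 50,
) -> int:
    """Return the knob setting with the best goodput found by local search."""
    current = start
    best = measure(current)
    direction = 1   # +1 probes a larger setting, -1 a smaller one
    reversals = 0   # consecutive probes that failed to improve goodput

    for _ in range(max_segments):
        # Probe the neighbouring setting, clamped to the allowed range.
        candidate = min(hi, max(lo, current + direction * step))
        improved = False
        if candidate != current:
            score = measure(candidate)
            if score > best:
                current, best = candidate, score
                improved = True

        if improved:
            reversals = 0
        else:
            # No gain in this direction: turn around. Two failures in a row
            # mean both neighbours are no better, so stop at a local optimum.
            direction = -direction
            reversals += 1
            if reversals >= 2:
                break
    return current


# Toy stand-in for a real system: goodput peaks at a setting of 12.
def toy_measure(setting: int) -> float:
    return 100.0 - (setting - 12) ** 2


best_setting = hill_climb(toy_measure, start=4, lo=1, hi=32)
```

The search is black-box in the sense described above: it sees only the goodput value returned for each segment and never inspects the internals of the serving system.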