Configuring stream processing systems for efficient performance, especially in cloud-native deployments, is a challenging and largely manual task. We present an experiment-driven approach for automated configuration optimization that combines three phases: Latin Hypercube Sampling for initial exploration, Simulated Annealing for guided stochastic search, and Hill Climbing for local refinement. The workflow is integrated with the cloud-native Theodolite benchmarking framework, enabling automated experiment orchestration on Kubernetes and early termination of underperforming configurations. In an experimental evaluation with Kafka Streams and a Kubernetes-based cloud testbed, our approach identifies configurations that improve throughput by up to 23% over the default. The results indicate that Latin Hypercube Sampling with early termination and Simulated Annealing are particularly effective in navigating the configuration space, whereas additional fine-tuning via Hill Climbing yields limited benefits.
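The three-phase workflow described above can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the parameter names and bounds in `SPACE`, the synthetic `throughput` function standing in for a real benchmark run, and the annealing settings (`steps`, `t0`, `cooling`). Theodolite orchestration and early termination of underperforming configurations are omitted.

```python
import math
import random

# Hypothetical search space (illustrative names and bounds, not from the paper).
SPACE = {"num_stream_threads": (1, 16), "batch_size_kb": (1, 512), "linger_ms": (0, 100)}

def throughput(cfg):
    """Synthetic stand-in for a benchmark run, peaking at a made-up optimum."""
    target = {"num_stream_threads": 8, "batch_size_kb": 256, "linger_ms": 20}
    penalty = sum(((cfg[k] - target[k]) / (hi - lo)) ** 2 * 300
                  for k, (lo, hi) in SPACE.items())
    return 1000.0 - penalty

def latin_hypercube(n, rng):
    """Phase 1: draw n configurations, one per stratum in every dimension."""
    columns = {}
    for name, (lo, hi) in SPACE.items():
        points = [(i + rng.random()) / n for i in range(n)]  # one point per stratum
        rng.shuffle(points)  # decouple the strata across dimensions
        columns[name] = [round(lo + p * (hi - lo)) for p in points]
    return [{name: columns[name][i] for name in SPACE} for i in range(n)]

def neighbor(cfg, rng, scale):
    """Perturb one randomly chosen parameter by a fraction of its range."""
    name = rng.choice(list(SPACE))
    lo, hi = SPACE[name]
    step = max(1, round(scale * (hi - lo)))
    new = dict(cfg)
    new[name] = min(hi, max(lo, cfg[name] + rng.choice([-step, step])))
    return new

def simulated_annealing(start, rng, steps=200, t0=50.0, cooling=0.97):
    """Phase 2: stochastic search that sometimes accepts worse configurations."""
    current, cur_val = start, throughput(start)
    best, best_val = current, cur_val
    temp = t0
    for _ in range(steps):
        cand = neighbor(current, rng, scale=0.1)
        val = throughput(cand)
        # Accept improvements always, regressions with Boltzmann probability.
        if val >= cur_val or rng.random() < math.exp((val - cur_val) / temp):
            current, cur_val = cand, val
        if cur_val > best_val:
            best, best_val = current, cur_val
        temp *= cooling
    return best

def hill_climb(start):
    """Phase 3: greedy unit steps until no neighbor improves the objective."""
    current, cur_val = start, throughput(start)
    while True:
        candidates = [{**current, name: current[name] + delta}
                      for name, (lo, hi) in SPACE.items()
                      for delta in (-1, 1)
                      if lo <= current[name] + delta <= hi]
        best = max(candidates, key=throughput)
        if throughput(best) <= cur_val:
            return current
        current, cur_val = best, throughput(best)

def optimize(seed=0, n_samples=20):
    rng = random.Random(seed)
    start = max(latin_hypercube(n_samples, rng), key=throughput)
    return hill_climb(simulated_annealing(start, rng))
```

Calling `optimize()` returns the best configuration found as a dictionary. In a real setting, each call to `throughput` would correspond to a full benchmark execution, so the number of evaluations per phase is the cost that matters, not the runtime of the search logic itself.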