In this work, we present COSTREAM, a novel learned cost model for Distributed Stream Processing Systems that provides accurate predictions of the execution costs of a streaming query in an edge-cloud environment. The cost model can be used to find an initial placement of operators across heterogeneous hardware, which is particularly important in these environments. In our evaluation, we demonstrate that COSTREAM can produce highly accurate cost estimates for the initial operator placement and even generalize to unseen placements, queries, and hardware. When using COSTREAM to optimize the placements of streaming operators, a median speed-up of around 21x can be achieved compared to baselines.