As distributed scientific workflows grow in importance, there is a critical need to satisfy Quality of Service (QoS) constraints, such as minimizing execution time or restricting execution to a subset of resources. However, workflow behavior is unpredictable, even across similar configurations, which makes QoS guarantees difficult to provide. To support effective reasoning about QoS-driven scheduling, we introduce QoSFlow, a performance modeling method that partitions a workflow's execution configuration space into regions with similar behavior. Each region groups configurations whose execution times are comparable at a given statistical sensitivity, enabling efficient QoS-driven scheduling through analytical reasoning rather than exhaustive testing. An evaluation on three diverse workflows shows that QoSFlow's execution recommendations outperform the best-performing standard heuristic by 27.38%. Empirical validation confirms that QoSFlow's recommended configurations consistently match measured execution outcomes across different QoS constraints.
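To make the idea of partitioning a configuration space into regions of comparable execution time more concrete, the following is a minimal illustrative sketch and not QoSFlow's actual algorithm. It assumes a hypothetical `Config` record with a node count, a storage tier, and a measured execution time, and it stands in a simple relative tolerance (`sensitivity`) for the statistical sensitivity mentioned in the abstract. The function names `partition_by_time` and `fastest_region`, and all sample values, are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """Hypothetical execution configuration with a measured runtime."""
    nodes: int
    storage: str
    time_s: float


def partition_by_time(configs, sensitivity):
    """Group configurations into regions of comparable execution time.

    Assumption: two configurations are 'comparable' when the slower one is
    within (1 + sensitivity) of the fastest time in the region. This relative
    tolerance is a stand-in for a proper statistical test.
    """
    regions = []
    for cfg in sorted(configs, key=lambda c: c.time_s):
        if regions and cfg.time_s <= regions[-1][0].time_s * (1 + sensitivity):
            regions[-1].append(cfg)   # close enough to the region's fastest time
        else:
            regions.append([cfg])     # start a new, slower region
    return regions


def fastest_region(regions, max_nodes):
    """Return the feasible configurations from the fastest region that
    contains at least one configuration using no more than max_nodes."""
    for region in regions:            # regions are ordered fastest first
        feasible = [c for c in region if c.nodes <= max_nodes]
        if feasible:
            return feasible
    return []


# Made-up sample measurements, for illustration only.
configs = [
    Config(2, "ssd", 100.0),
    Config(4, "ssd", 104.0),
    Config(8, "ssd", 60.0),
    Config(8, "hdd", 63.0),
    Config(2, "hdd", 150.0),
]

regions = partition_by_time(configs, sensitivity=0.10)
choice = fastest_region(regions, max_nodes=4)
```

With these sample values, a QoS constraint of at most four nodes is answered by scanning regions from fastest to slowest rather than by testing every configuration, which is the kind of analytical reasoning the abstract describes. A real implementation would presumably replace the fixed tolerance with a statistical comparison over repeated measurements.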