Concurrent workloads often extract insights from high-throughput, real-time data streams. Existing stream processing engines isolate each query's resources, which ensures robust performance but incurs high infrastructure costs. In contrast, sharing work across queries reduces the resources required but introduces inter-query interference, degrading performance for some queries. We introduce FunShare, a stream-processing system that improves resource efficiency without compromising performance by dynamically grouping queries based on their performance characteristics. FunShare strategically relaxes query interdependencies and minimizes redundant computation while preserving individual query performance. It does so through an adaptive optimization framework that monitors execution metrics, accurately estimates computation overlaps, and reconfigures execution plans on the fly in response to changes in the underlying data streams. Our evaluation demonstrates that FunShare reduces resource consumption compared to isolated execution while maintaining or improving throughput for all queries.