Modern distributed pipelined query engines either do not support intra-query fault tolerance or employ high-overhead approaches such as persisting intermediate outputs or checkpointing state. In this work, we present write-ahead lineage, a novel fault recovery technique that combines Spark's lineage-based replay with write-ahead logging. Unlike Spark, where the lineage is determined before query execution, write-ahead lineage persistently logs lineage at runtime to support dynamic task dependencies in pipelined query engines. Since only KB-sized lineages are persisted instead of MB-sized intermediate outputs, the overhead during normal execution is minimal compared to spooling- or checkpointing-based approaches. To ensure fast fault recovery, tasks only consume intermediate outputs whose lineage has been persisted, which prevents global rollbacks upon failure. In addition, lost tasks from different stages can be recovered in a pipeline-parallel manner. We implement write-ahead lineage in a distributed pipelined query engine called Quokka. We show that Quokka is around 2x faster than SparkSQL on the TPC-H benchmark with similar fault recovery performance.
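The idea of logging lineage at runtime and replaying it after a failure can be illustrated with a small, single-process sketch. This is not Quokka's implementation or API: the names `LineageLog`, `Engine`, `run_task`, and `recover` are hypothetical, an in-memory dictionary stands in for durable storage, and task functions are assumed to be deterministic so that replay reproduces the lost outputs.

```python
from dataclasses import dataclass, field


@dataclass
class LineageLog:
    """Stand-in for a durable log (assumption: a real system persists this)."""
    records: dict = field(default_factory=dict)  # output id -> tuple of input ids

    def persist(self, output_id, input_ids):
        self.records[output_id] = tuple(input_ids)

    def is_persisted(self, output_id):
        return output_id in self.records


class Engine:
    """Hypothetical toy engine; not the Quokka API."""

    def __init__(self, sources):
        self.sources = dict(sources)  # source partitions, assumed re-readable
        self.log = LineageLog()
        self.outputs = {}             # volatile intermediate outputs

    def read(self, data_id):
        if data_id in self.sources:
            return self.sources[data_id]
        # Only consume intermediate outputs whose lineage is already logged.
        if not self.log.is_persisted(data_id):
            raise RuntimeError(f"lineage for {data_id!r} is not persisted")
        return self.outputs[data_id]

    def run_task(self, output_id, input_ids, fn):
        # input_ids is chosen by the caller at runtime (dynamic dependencies).
        result = fn([self.read(i) for i in input_ids])
        self.log.persist(output_id, input_ids)  # log lineage before publishing
        self.outputs[output_id] = result
        return result

    def recover(self, output_id, fns):
        """Rebuild a lost output by replaying the logged lineage recursively."""
        if output_id in self.sources:
            return self.sources[output_id]
        if output_id in self.outputs:
            return self.outputs[output_id]
        inputs = [self.recover(i, fns) for i in self.log.records[output_id]]
        self.outputs[output_id] = fns[output_id](inputs)
        return self.outputs[output_id]


def double_all(parts):
    return [2 * v for part in parts for v in part]


def total(parts):
    return sum(parts[0])


fns = {"a": double_all, "b": total}
engine = Engine({"s0": [1, 2], "s1": [3, 4]})
engine.run_task("a", ["s0", "s1"], fns["a"])
before = engine.run_task("b", ["a"], fns["b"])

engine.outputs.clear()             # simulate losing all intermediate outputs
after = engine.recover("b", fns)   # replay from the lineage log and sources
```

In this sketch only the small lineage records survive the simulated failure; the intermediate outputs are recomputed from the sources. Because a task may read only outputs whose lineage is logged, recovery never has to undo work that depended on an unrecorded output.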