Although the cloud has reached a state of robustness, the burden of using its resources falls on programmers, who struggle to keep up with ever-growing cloud infrastructure services and abstractions. As a result, state management, scaling, operation, and failure management of scalable cloud applications require disproportionately more effort than developing the applications' actual business logic. Our vision is to raise the abstraction level for programming scalable cloud applications by compiling stateful entities (a programming model that enables imperative transactional programs authored in Python) into stateful streaming dataflows. We propose a compiler pipeline that analyzes the abstract syntax tree of stateful entities and transforms them into an intermediate representation based on stateful dataflow graphs. The pipeline then compiles that intermediate representation to different dataflow engines, leveraging their exactly-once message processing guarantees to prevent state or failure management primitives from "leaking" into the programming model. Preliminary experiments with a proof-of-concept implementation show that, despite program transformation and translation to dataflows, stateful entities can perform at sub-100ms latency even for transactional workloads.
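To make the abstract-syntax-tree analysis step described above more concrete, the following is a minimal sketch, not the pipeline proposed here. It assumes that a stateful entity is written as a plain Python class whose state lives in `self` attributes. The `Account` class, the `state_access` helper, and the shape of the returned summary are hypothetical names introduced only for illustration. The sketch uses Python's standard `ast` module to record which state fields each method reads and writes, the kind of information a compiler could use when building a stateful dataflow graph.

```python
import ast
import textwrap

# Hypothetical entity source, used only for illustration.
ENTITY_SOURCE = textwrap.dedent("""
    class Account:
        def __init__(self, owner: str):
            self.owner = owner
            self.balance = 0

        def deposit(self, amount: int) -> int:
            self.balance += amount
            return self.balance

        def owner_name(self) -> str:
            return self.owner
""")


def state_access(source: str) -> dict:
    """Map each method of the first class in `source` to the
    `self` attributes it reads and writes."""
    tree = ast.parse(source)
    # Take the first class definition found in the module.
    cls = next(n for n in ast.walk(tree) if isinstance(n, ast.ClassDef))
    summary = {}
    for method in cls.body:
        if not isinstance(method, ast.FunctionDef):
            continue
        reads, writes = set(), set()
        for node in ast.walk(method):
            # Only attribute accesses of the form `self.<name>` count as state.
            if (isinstance(node, ast.Attribute)
                    and isinstance(node.value, ast.Name)
                    and node.value.id == "self"):
                if isinstance(node.ctx, ast.Store):
                    writes.add(node.attr)
                elif isinstance(node.ctx, ast.Load):
                    reads.add(node.attr)
        summary[method.name] = {"reads": sorted(reads), "writes": sorted(writes)}
    return summary


if __name__ == "__main__":
    for name, access in state_access(ENTITY_SOURCE).items():
        print(name, access)
```

Under these assumptions, a method whose summary has an empty `writes` list (such as `owner_name`) could be treated as read-only, whereas a method like `deposit` updates entity state. How the actual compiler classifies methods and derives the intermediate representation is not specified in this abstract.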