Modern hardware heterogeneity brings efficiency and performance opportunities for analytical query processing. As data volume and complexity continue to grow, bridging the gap between recent hardware advancements and the data processing tools ecosystem is paramount for speeding up ETL and model development. In this paper, we present a comprehensive overview of existing analytical query processing approaches, as well as the design and use of systems that employ heterogeneous hardware for this task. We then analyze state-of-the-art solutions and identify missing pieces. The last two chapters discuss the identified problems and present our view on how the ecosystem should evolve.