This paper presents Odyssey, a novel distributed data-series processing framework that addresses two critical challenges in data-series processing, achieving good speedup and ensuring high scalability, by exploiting the full computational capacity of modern clusters composed of multi-core servers. Odyssey tackles several challenges in designing an efficient and highly scalable distributed data-series index, including efficient scheduling and load balancing without paying the prohibitive cost of moving data around. It also supports a flexible partial replication scheme, which lets Odyssey navigate a fundamental trade-off between data scalability and query-answering performance. Our experimental analysis, covering a wide range of configurations and several real and synthetic datasets, demonstrates that Odyssey achieves these goals.