As time-series applications grow larger, there is increasing demand for symbolic representations that are compact, accurate, and scalable across many signals and computing resources. Current ABBA-based symbolic approximation methods produce high-quality, shape-preserving representations, but they process each time series separately and sequentially. As a result, they do not ensure consistent symbols across different series and cannot fully exploit modern multicore and distributed-memory systems. This paper presents a joint symbolic approximation method for large-scale time series. The proposed method decouples local compression from global digitization: (i) time series are partitioned into independent domains that can be compressed in parallel, and (ii) the resulting pieces are digitized using a shared global dictionary. To further improve scalability, we introduce a two-stage parallel digitization scheme in which aggregation is first performed locally and then merged globally, without requiring a full-data reassignment step. Extensive experiments on time-series datasets and large synthetic benchmarks show that our approach maintains competitive reconstruction quality while substantially reducing runtime. These results indicate that joint symbolic approximation can serve as an efficient, high-level parallel tool for analyzing large-scale temporal data.
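The decoupling described above can be illustrated with a minimal sketch: each series is compressed independently into ABBA-style (length, increment) pieces, and all pieces are then digitized against one shared dictionary, so the same symbol means the same shape across every series. All function names, the tolerance `tol`, and the tiny k-means-style clustering below are illustrative assumptions, not the paper's actual implementation; in practice the per-series compression step would run in parallel (e.g. via `concurrent.futures.ProcessPoolExecutor`).

```python
def compress(series, tol=0.2):
    """Greedy piecewise-linear compression of one series (illustrative).
    Merges consecutive points while the straight-line fit error stays
    within tol; returns ABBA-style (length, increment) pieces."""
    pieces, start = [], 0
    for end in range(1, len(series)):
        inc, n = series[end] - series[start], end - start
        # max deviation of the straight line from the actual values
        err = max(abs(series[start] + inc * (i - start) / n - series[i])
                  for i in range(start, end + 1))
        if err > tol:  # close the current piece just before this point
            pieces.append((end - 1 - start, series[end - 1] - series[start]))
            start = end - 1
    pieces.append((len(series) - 1 - start, series[-1] - series[start]))
    return pieces

def digitize(all_pieces, k, iters=10):
    """Assign every piece (pooled from all series) a symbol from one
    shared global dictionary via a tiny k-means on (length, increment)."""
    dist = lambda p, c: (p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2
    centers = [list(p) for p in all_pieces[:k]]  # naive initialization
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in all_pieces:
            groups[min(range(k), key=lambda j: dist(p, centers[j]))].append(p)
        for j, g in enumerate(groups):
            if g:
                centers[j] = [sum(x) / len(g) for x in zip(*g)]
    symbols = [min(range(k), key=lambda j: dist(p, centers[j]))
               for p in all_pieces]
    return symbols, centers

# Two series compressed independently (this step is the parallel part),
# then digitized jointly so symbols are consistent across both series.
s1, s2 = [0, 1, 2, 3, 2, 1, 0], [1, 2, 3, 4, 3, 2, 1]
pieces = compress(s1) + compress(s2)
symbols, dictionary = digitize(pieces, k=2)
```

Because digitization sees the pooled pieces, an upward ramp in `s1` and the same ramp in `s2` receive the same symbol, which per-series digitization cannot guarantee.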