Data pipelines inevitably evolve over time, particularly in the structure and semantics of their data and in their operators. Dealing with these changes, i.e. providing long-term maintenance, is costly. The present work explores the need for evolution capabilities within pipeline frameworks. In this context, dealing with evolution is defined as a two-step process consisting of self-awareness and self-adaption. Furthermore, a conceptual requirements model is provided, which encompasses criteria for self-awareness and self-adaption and covers the dimensions data, operator, pipeline, and environment. Existing frameworks lack these capabilities, which exposes a major gap. Filling this gap will be a significant contribution to practitioners and scientists alike. The present work envisions, and lays the foundation for, a framework that can handle evolutionary change.