Modern logistics systems generate continuous streams of data from sources such as GPS, IoT sensors, and logistics management systems. Aggregating, processing, and analyzing these data have become vital for monitoring operations, optimizing efficiency, and supporting timely decision making. This paper presents an event-driven MapReduce framework for real-time data processing in logistics environments. The system runs on Kubernetes with Knative and uses Apache Kafka as the backbone for communication between components. The platform is composed of five loosely coupled services that receive, process, and aggregate incoming data in real time. Redis preserves workflow metadata, while AWS S3 provides persistent storage for the framework. The design is inspired by the MapReduce programming model and integrates Function-as-a-Service (FaaS) principles with distributed processing techniques, allowing configurable scaling based on workload demands and the underlying hardware. Experimental evaluation shows that the system scales effectively as the input data volume increases while supporting scale-to-zero, on-demand processing.
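
As a rough illustration of the MapReduce programming model that inspires the design, the following minimal sketch runs the map, shuffle, and reduce stages in a single Python process over a handful of telemetry events. It is not the framework's implementation: the event fields (`vehicle_id`, `distance_km`) and all function names are hypothetical, and the in-memory grouping merely stands in for the message passing that the described system performs between separate services over Kafka.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical event shape: one telemetry reading per dictionary.
Event = Dict[str, object]


def map_event(event: Event) -> Iterator[Tuple[str, float]]:
    """Map stage: emit a (key, value) pair for each incoming event."""
    yield str(event["vehicle_id"]), float(event["distance_km"])


def shuffle(pairs: Iterable[Tuple[str, float]]) -> Dict[str, List[float]]:
    """Shuffle stage: group all values that share the same key."""
    groups: Dict[str, List[float]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_group(key: str, values: List[float]) -> Tuple[str, float]:
    """Reduce stage: aggregate the values of one key (here, a sum)."""
    return key, sum(values)


def run_job(events: Iterable[Event]) -> Dict[str, float]:
    """Run map, shuffle, and reduce sequentially in one process."""
    pairs = (pair for event in events for pair in map_event(event))
    grouped = shuffle(pairs)
    return dict(reduce_group(key, values) for key, values in grouped.items())


# Illustrative input only; these values are made up for the sketch.
events = [
    {"vehicle_id": "truck-1", "distance_km": 2.5},
    {"vehicle_id": "truck-2", "distance_km": 3.0},
    {"vehicle_id": "truck-1", "distance_km": 5.0},
]
totals = run_job(events)
print(totals)
```

In an event-driven deployment, each stage would presumably be triggered by the arrival of messages rather than called directly, which is what allows idle stages to scale to zero. The sketch keeps the stages as plain functions only to make the data flow visible.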