Current network telemetry pipelines carry massive streams of fine-grained Key Performance Indicators (KPIs) from multiple distributed sources to central aggregators, making data storage, transmission, and real-time analysis increasingly unsustainable. This work presents a generative AI (GenAI)-driven sampling and hybrid compression framework that redesigns network telemetry from a goal-oriented perspective. Unlike conventional approaches that passively compress fully observed data, our approach jointly optimizes what to observe and how to encode it, guided by the relevance of information to downstream tasks. The framework integrates adaptive sampling policies, realized through adaptive masking techniques, with generative modeling to identify patterns and preserve critical features across temporal and spatial dimensions. The selectively acquired data are further processed by a hybrid compression scheme that combines traditional lossless coding with GenAI-driven lossy compression. Experimental results on real network datasets demonstrate reductions of over 50$\%$ in sampling and data-transfer costs, while maintaining comparable reconstruction accuracy and goal-oriented analytical fidelity in downstream tasks.