The integrity of time series data in smart grids is often compromised by missing values caused by sensor failures, transmission errors, or other disruptions. Gaps in smart meter data can bias consumption analyses and hinder reliable predictions, leading to technical and economic inefficiencies. As smart meter data grows in volume and complexity, conventional techniques struggle with its nonlinear and nonstationary patterns. In this context, Generative Artificial Intelligence offers promising solutions that may outperform traditional statistical methods. In this paper, we evaluate two general-purpose Large Language Models and five Time Series Foundation Models for smart meter data imputation, comparing them with conventional Machine Learning and statistical models. To test the models' inference capabilities, we introduce artificial gaps ranging from 30 minutes to one day into an anonymized public dataset. Results show that Time Series Foundation Models, with their contextual understanding and pattern recognition, could significantly enhance imputation accuracy in certain cases. However, the trade-off between computational cost and performance gains remains a critical consideration.
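The evaluation protocol described in the abstract (masking known readings, imputing them, and scoring against the held-out truth) can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: the 30-minute sampling resolution, the synthetic load curve, the mean absolute error metric, and the helper names `synthetic_load`, `mask_gap`, `linear_impute`, and `mae` are hypothetical and are not taken from the paper. Linear interpolation stands in for a simple statistical baseline, not for any of the evaluated models.

```python
import math
import random

SAMPLES_PER_DAY = 48  # assumption: one reading every 30 minutes


def synthetic_load(n_days, seed=0):
    """Toy daily-periodic load curve; a stand-in for real smart meter data."""
    rng = random.Random(seed)
    n = n_days * SAMPLES_PER_DAY
    return [
        1.0 + 0.5 * math.sin(2 * math.pi * t / SAMPLES_PER_DAY) + rng.gauss(0, 0.05)
        for t in range(n)
    ]


def mask_gap(series, start, length):
    """Return a copy with `length` consecutive readings replaced by None."""
    if start < 1 or start + length >= len(series):
        raise ValueError("gap must leave an observed value on each side")
    masked = list(series)
    for i in range(start, start + length):
        masked[i] = None
    return masked


def linear_impute(masked):
    """Fill each interior run of None by interpolating between its neighbours.

    Assumes every gap has an observed value on both sides (as mask_gap enforces).
    """
    filled = list(masked)
    i = 0
    while i < len(filled):
        if filled[i] is None:
            j = i
            while filled[j] is None:  # find the first observed value after the gap
                j += 1
            left, right = filled[i - 1], filled[j]
            span = j - i + 1  # distance between the two observed neighbours
            for k in range(i, j):
                filled[k] = left + (right - left) * (k - i + 1) / span
            i = j
        else:
            i += 1
    return filled


def mae(truth, imputed, start, length):
    """Mean absolute error over the masked positions only."""
    return sum(abs(truth[i] - imputed[i]) for i in range(start, start + length)) / length


if __name__ == "__main__":
    truth = synthetic_load(n_days=7)
    # Gap lengths from 30 minutes (1 sample) to one day (48 samples).
    for length in (1, 6, 24, 48):
        masked = mask_gap(truth, start=100, length=length)
        error = mae(truth, linear_impute(masked), 100, length)
        print(f"gap of {length} samples: MAE = {error:.4f}")
```

Scoring only the masked positions keeps the metric from being diluted by readings the imputer never had to reconstruct. Any other imputer could be compared under the same protocol by swapping it in for `linear_impute`.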