Historical data on day-ahead electricity prices and on the underlying price formation process are limited. In addition, the European electricity market is undergoing a rapid transformation, which limits how representative older observations are for predictive purposes. At the same time, the machine learning methods that have gained traction in electricity price forecasting typically require large amounts of data. This study analyses how effectively encoding well-established domain knowledge can mitigate the need for large training datasets. The domain knowledge is incorporated by imposing a structure on the price forecasting problem, and the resulting accuracy gains are quantified in an experiment. Compared to an "unstructured", purely statistical model, introducing intermediate quantity forecasts of load, renewable infeed, and cross-border exchange, paired with the estimation of supply curves, is shown to reduce the normalised root-mean-square error (NRMSE) by as much as 0.1 during daytime hours. The most statistically significant improvements are achieved on the first day of the forecasting horizon when a purely statistical model is combined with structured models. Finally, the results are evaluated and interpreted in light of the dynamic market conditions observed in Europe during the experiment period (1 October 2022 to 30 April 2023), highlighting the adaptive nature of models trained on shorter timescales.
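The structure imposed on the forecasting problem can be illustrated with a minimal sketch: intermediate quantity forecasts are combined into a residual demand, and the price is then read off an estimated supply curve. Everything specific here is an assumption made for illustration and is not taken from the study: the function names, the sign convention for cross-border exchange (positive means net import), the piecewise-linear form of the supply curve, and all numbers.

```python
import numpy as np


def residual_demand(load, renewable, net_import):
    """Demand left for dispatchable plants, in MW.

    Assumed sign convention: positive net_import reduces the residual.
    """
    load = np.asarray(load, dtype=float)
    renewable = np.asarray(renewable, dtype=float)
    net_import = np.asarray(net_import, dtype=float)
    return load - renewable - net_import


def price_from_supply_curve(residual, curve_quantity, curve_price):
    """Read the price off a piecewise-linear supply curve.

    curve_quantity must be increasing; values outside its range are
    clamped to the first or last price by np.interp.
    """
    return np.interp(residual, curve_quantity, curve_price)


# Hypothetical supply curve: cumulative capacity (MW) against price (EUR/MWh).
curve_quantity = np.array([0.0, 20_000.0, 40_000.0, 60_000.0])
curve_price = np.array([0.0, 50.0, 120.0, 300.0])

# Hypothetical intermediate quantity forecasts for two hours (MW).
load = [50_000.0, 55_000.0]
renewable = [20_000.0, 5_000.0]
net_import = [0.0, 10_000.0]

residual = residual_demand(load, renewable, net_import)
prices = price_from_supply_curve(residual, curve_quantity, curve_price)
```

In a sketch like this, the statistical learning effort shifts from the price itself to the intermediate quantities and the supply curve. That is one way domain knowledge could reduce the amount of price history needed for training.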