Energy forecasting research faces a persistent comparability gap that makes it difficult to measure progress consistently over time. Reported accuracy gains are often not directly comparable because models are evaluated under study-specific datasets, time periods, information sets, and scoring setups, while widely used benchmarks and competition datasets are typically tied to fixed historical windows. This paper introduces the Energy-Arena, a dynamic benchmarking platform for operational energy time series forecasting that provides a continuously updated reference point as energy systems evolve. The platform operates as an open, API-based submission system and standardizes challenge definitions and submission deadlines so that they align with operational constraints. Performance is reported on rolling evaluation windows via persistent leaderboards. By moving from retrospective backtesting to forward-looking benchmarking, the Energy-Arena enforces standardized ex-ante submission and ex-post evaluation, which prevents information leakage and retroactive tuning and thereby improves transparency. The platform is publicly available at Energy-Arena.org.