Advances in sensor networks have enabled real-time stream discharge monitoring, yet persistent sensor malfunctions limit data utility. Manual quality control by expert hydrologists cannot scale to networks that generate millions of measurements annually. We introduce HydroGEM, a foundation model for continental-scale streamflow quality control designed to support human expertise. HydroGEM uses self-supervised pretraining on 6.03 million clean sequences from 3,724 USGS stations to learn general hydrological representations, followed by fine-tuning with synthetic anomalies for detection and reconstruction. A hybrid TCN-Transformer architecture (14.2M parameters) captures both local and long-range temporal dependencies, while hierarchical normalization accommodates discharge values spanning six orders of magnitude. On held-out observations from 799 stations with 18 synthetic anomaly types grounded in USGS standards, HydroGEM achieves F1 = 0.792 for detection and a 68.7% reduction in reconstruction error, outperforming the strongest baseline by 36.3%. In validation on 100 Environment and Climate Change Canada stations, using tolerant evaluation with a ±24-hour buffer, HydroGEM achieves a tolerant F1 of 0.70 with 90.1% segment-level event detection, demonstrating cross-national generalization. The model maintains consistent detection across correction magnitudes and aligns with operational seasonal patterns: flagging peaks during winter ice-affected periods, matching hydrologists' correction behavior. Architectural separation between simplified training anomalies and complex test anomalies confirms that performance reflects learned hydrometric principles rather than pattern memorization.
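The tolerant evaluation mentioned above can be illustrated with a minimal sketch. The abstract does not give the exact scoring rule, so the definition used here is an assumption: a predicted flag counts as correct if it falls within the buffer of any true anomaly, and a true anomaly counts as detected if any prediction falls within the buffer of it. The function names `dilate` and `tolerant_f1` and the parameter `buffer_steps` are hypothetical; `buffer_steps` would be 24 for hourly data with a ±24-hour buffer.

```python
import numpy as np


def dilate(mask, buffer_steps):
    """Widen every True run in a boolean mask by buffer_steps on each side."""
    mask = np.asarray(mask, dtype=bool)
    if mask.size == 0 or buffer_steps == 0:
        return mask
    kernel = np.ones(2 * buffer_steps + 1, dtype=int)
    full = np.convolve(mask.astype(int), kernel, mode="full")
    # Trim the "full" convolution so the result is centred on the input.
    return full[buffer_steps:buffer_steps + mask.size] > 0


def tolerant_f1(y_true, y_pred, buffer_steps):
    """Point-wise F1 with a symmetric tolerance window (assumed definition)."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)

    n_pred = int(y_pred.sum())
    n_true = int(y_true.sum())

    # Precision: share of predicted flags lying near a true anomaly.
    hits_pred = int((y_pred & dilate(y_true, buffer_steps)).sum())
    precision = hits_pred / n_pred if n_pred else 0.0

    # Recall: share of true anomaly points lying near a predicted flag.
    hits_true = int((y_true & dilate(y_pred, buffer_steps)).sum())
    recall = hits_true / n_true if n_true else 0.0

    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


if __name__ == "__main__":
    truth = np.zeros(10, dtype=bool)
    truth[5] = True
    pred = np.zeros(10, dtype=bool)
    pred[3] = True
    # The flag is two steps early: rejected with a 1-step buffer,
    # accepted with a 2-step buffer.
    print(tolerant_f1(truth, pred, buffer_steps=1))
    print(tolerant_f1(truth, pred, buffer_steps=2))
```

A tolerance window of this kind avoids penalizing a detector for flagging an anomaly slightly before or after the timestamp recorded in the reference corrections, which is why it suits comparisons against operational records whose boundaries are approximate.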