Automated river gauge plate reading using a hybrid object detection and generative AI framework in the Limpopo River Basin

Accurate, continuous monitoring of river water levels is essential for flood forecasting, water resource management, and ecological protection. Traditional hydrological observation methods are often limited by manual measurement errors and environmental constraints. This study presents a hybrid framework that integrates vision-based waterline detection, YOLOv8-pose scale extraction, and large multimodal language models (GPT-4o and Gemini 2.0 Flash) for automated river gauge plate reading. The methodology proceeds through sequential stages: image preprocessing, annotation, waterline detection, scale gap estimation, and numeric reading extraction. In experiments, waterline detection achieved a precision of 94.24% and an F1 score of 83.64%, while scale gap detection provided accurate geometric calibration for subsequent reading extraction. Incorporating scale gap metadata substantially improved the predictive performance of the LLMs: Gemini Stage 2 achieved the highest accuracy, with a mean absolute error (MAE) of 5.43 cm, a root mean square error (RMSE) of 8.58 cm, and an R² of 0.84 under optimal image conditions. The results highlight the sensitivity of LLMs to image quality, with degraded images producing higher errors, and underscore the importance of combining geometric metadata with multimodal artificial intelligence for robust water level estimation. Overall, the proposed approach offers a scalable, efficient, and reliable solution for automated hydrological monitoring, with potential for real-time river gauge digitization and improved water resource management.
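
The three error measures reported above (mean absolute error, root mean square error, and R squared) can be computed from paired observed and predicted gauge readings. The following is a minimal sketch using the standard textbook definitions, not the study's own evaluation code. The function name `regression_metrics` and the sample readings are hypothetical and chosen only for illustration.

```python
import math


def regression_metrics(observed, predicted):
    """Return (MAE, RMSE, R^2) for paired water level readings in cm.

    Hypothetical helper for illustration; standard definitions are assumed.
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and equal in length")

    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]

    # Mean absolute error: average magnitude of the reading error.
    mae = sum(abs(e) for e in errors) / n

    # Root mean square error: penalizes large errors more heavily than MAE.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    # R^2: share of the variance in observed levels explained by the predictions.
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return mae, rmse, r2


# Made-up readings in cm, not data from the study.
observed = [100.0, 120.0, 140.0, 160.0]
predicted = [105.0, 115.0, 150.0, 160.0]
mae, rmse, r2 = regression_metrics(observed, predicted)
print(f"MAE={mae:.2f} cm, RMSE={rmse:.2f} cm, R^2={r2:.3f}")
```

RMSE is never smaller than MAE, so a gap between the two, such as the 8.58 cm versus 5.43 cm reported above, suggests that a minority of readings carry comparatively large errors.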
