Maritime port inspection plays a critical role in ensuring safety, regulatory compliance, and operational efficiency in complex maritime environments. However, existing inspection methods often rely on manual operations and conventional computer vision techniques that lack scalability and contextual understanding. This study introduces a novel integrated engineering framework that combines Large Language Models (LLMs) and Vision Language Models (VLMs) to enable autonomous maritime port inspection using cooperative aerial and surface robotic platforms. The proposed framework replaces traditional state-machine mission planners with LLM-driven symbolic planning and improves perception pipelines through VLM-based semantic inspection, enabling context-aware and adaptive monitoring. The LLM module translates natural language mission instructions into executable symbolic plans with dependency graphs that encode operational constraints and ensure safe coordination between the unmanned aerial vehicle (UAV) and the unmanned surface vehicle (USV). Meanwhile, the VLM module performs real-time semantic inspection and compliance assessment, generating structured reports with contextual reasoning. The framework was validated using the extended MBZIRC Maritime Simulator with realistic port infrastructure and further assessed through real-world robotic inspection trials. The lightweight on-board design makes the framework suitable for resource-constrained maritime platforms, advancing the development of intelligent, autonomous inspection systems. Project resources (code and videos) are available at: https://github.com/Muhayyuddin/llm-vlm-fusion-port-inspection
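
To make the idea of a symbolic plan with a dependency graph concrete, the following is a minimal sketch and not the authors' implementation. It assumes a plan is a dictionary of named steps, each listing the steps that must finish first. The step names, the `PLAN` structure, and the `execution_waves` helper are all hypothetical and chosen only for illustration. The sketch orders the steps topologically, so that UAV and USV actions run only after their prerequisites, and rejects plans whose constraints are circular.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical symbolic plan (illustrative names, not taken from the paper):
# each step records the platform that runs it and the steps it depends on.
PLAN = {
    "usv_navigate_to_dock": {"platform": "USV", "after": []},
    "uav_takeoff": {"platform": "UAV", "after": ["usv_navigate_to_dock"]},
    "uav_inspect_crane": {"platform": "UAV", "after": ["uav_takeoff"]},
    "usv_inspect_hull": {"platform": "USV", "after": ["usv_navigate_to_dock"]},
    "uav_land_on_usv": {
        "platform": "UAV",
        "after": ["uav_inspect_crane", "usv_inspect_hull"],
    },
}


def execution_waves(plan):
    """Group plan steps into waves; steps in one wave have no mutual dependencies."""
    # Reject dependencies that point at steps missing from the plan.
    unknown = {dep for step in plan.values() for dep in step["after"]} - plan.keys()
    if unknown:
        raise ValueError(f"unknown dependencies: {sorted(unknown)}")

    # TopologicalSorter expects a mapping from node to its predecessors.
    sorter = TopologicalSorter({name: step["after"] for name, step in plan.items()})
    sorter.prepare()  # raises CycleError if the constraints are circular

    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # every step whose prerequisites are done
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for index, wave in enumerate(execution_waves(PLAN), start=1):
        print(index, wave)
```

In this sketch, steps that share a wave could in principle run concurrently on the two platforms, while the landing step waits for both inspections to finish. A real planner would also need to validate the LLM output against the robots' actual capabilities, which is outside the scope of this illustration.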